Australian  Government 
Department  of  Defence 

Defence  Science  and 
Technology  Organisation 


A  Course  of  Lectures  on  Statistical  Mechanics 

Don  Koks 

Electronic  Warfare  and  Radar  Division 
Defence  Science  and  Technology  Organisation 

DSTO-GD-0612


ABSTRACT 

This is a set of lectures given by the author in 2009 at Flinders University, Adelaide,
comprising one semester of a third-year undergraduate course in physics.
The lectures begin with an introduction to the theoretical background of statistical
mechanics, and then continue with a mixture of theory and application.
Topics covered are those that comprise the standard tool kit for advanced study
in the field.

Executive  Summary 

These lectures were given by the author in 2009 for a one-semester course at Flinders University,
Adelaide, as part of that university’s third-year undergraduate course in physics.

The lecture notes begin with an introduction to the mathematical background of statistical
mechanics. They introduce the all-important notion of entropy, which leads to the
concept  of  temperature  and  then  to  the  basics  of  thermodynamics.  Following  this  is  an 
excursion  into  the  physical  chemistry  of  dissolved  salts,  and  the  idea  of  a  reaction  attaining 
equilibrium.  The  Boltzmann  distribution  is  then  introduced;  this  serves  as  the  departure 
point  for  a  study  of  systems  interacting  with  an  environment.  The  relevant  ideas  allow 
the  entropy  of  more  complex  and  non-isolated  systems  to  be  calculated.  At  this  point  we 
include  a  discussion  of  the  approach  to  entropy  advocated  by  E.T.  Jaynes,  who  was  at  the 
forefront  of  advanced  ideas  in  statistical  mechanics  throughout  the  twentieth  century. 

Several  standard  topics  are  then  covered:  the  Maxwell  speed  and  velocity  distributions, 
and  the  theory  of  transport  processes  that  successfully  interrelates  thermal  conductivity, 
viscosity, and heat capacity; this success was historically of prime importance to the
formation of an atomic view of matter in physics.

The  notes  end  with  a  discussion  of  quantum  statistics,  blackbody  radiation,  electric 
conductivity,  and  semiconductors. 

It  should  be  emphasised  that,  course  notes  being  what  they  are,  the  following  pages 
cover  the  main  ideas  briskly.  In  particular,  no  pictures  have  been  included,  although  of 
course many were drawn in the lectures to aid in the presentation.


1 Introduction

These  lectures  begin  with  an  introduction  to  the  mathematical  background  of  statistical 
mechanics.  They  introduce  the  all-important  notion  of  the  entropy  of  an  isolated  system. 
This  leads  to  the  concept  of  temperature  and  then  to  the  basics  of  thermodynamics. 
Following  this  is  an  excursion  into  chemical  concepts  to  do  with  dissolved  salts  and  the 
idea  of  a  reaction  attaining  equilibrium.  Next,  the  Boltzmann  Distribution  is  introduced; 
this  serves  as  the  departure  point  for  the  study  of  systems  interacting  with  an  environment 
of  which  we  might  know  nothing.  The  ideas  here  give  meaning  to  the  entropy  of  more 
complex  and  non-isolated  systems,  and  allow  it  to  be  calculated.  At  this  point  we  include 
a  discussion  of  the  ideas  of  E.T.  Jaynes,  who  was  at  the  forefront  of  advanced  ideas  in 
statistical  mechanics  throughout  the  twentieth  century. 

Following  this,  several  standard  topics  are  covered:  the  Maxwell  speed  and  velocity 
distributions,  and  the  theory  of  transport  processes  that  successfully  interrelates  thermal 
conductivity,  viscosity,  and  heat  capacity;  this  success  was  historically  of  prime  importance 
in  the  formation  of  an  atomic  view  of  matter  in  physics. 

The  notes  end  with  a  discussion  of  quantum  statistics,  blackbody  radiation,  electric 
conductivity,  and  semiconductors. 

The  general  ordering  of  the  subjects  here  follows  reference  [1].  However,  much  of  the 
mathematical  analysis  in  these  notes  follows  different  routes  from  those  in  that  book,  and 
other  content  has  been  added  to  these  notes. 

It  should  be  emphasised  that,  course  notes  being  what  they  are,  the  following  pages 
do  cover  the  main  ideas  briskly.  In  particular,  no  pictures  have  been  included,  although 
of  course  many  were  drawn  in  the  lectures  to  aid  in  the  presentation. 


2  Preliminaries  for  Counting  Large  Numbers 

Statistical  mechanics  is  built  on  the  idea  that  the  world  can  be  described  using  probability. 
Yet  on  the  surface  there  seems  to  be  little  randomness  in  the  world  around  us,  so  is  a 
probabilistic  description  really  such  a  good  idea? 

We  will  begin  to  answer  this  question  by  asking  something  more  basic:  given  a  set 
number  of  particles  of  a  gas  in  a  room,  what’s  the  chance  of  there  being  some  given  number 
of  particles  in  a  given  part  of  the  room?  Furthermore,  how  probable  are  fluctuations 
around  this  number?  It  will  turn  out  that  for  systems  with  large  numbers  of  particles  such 
as  we  find  in  everyday  life,  fluctuations  are  very  improbable  things.  This  indicates  that 
a  probabilistic  view  of  the  world  might  well  be  compatible  with  the  fact  that  we  don’t 
see  a  lot  of  randomness  around  us.  It’s  the  starting  point  for  the  subject  of  statistical 
mechanics. 


2.1  The  Binomial  Distribution 

It’s  simplest  to  divide  the  room  into  two  regions  and  find  probabilities  for  different  numbers 
of  the  particles  to  be  in  each  region.  This  is  the  job  of  the  binomial  distribution.  Given 
N  distinguishable  particles,  allocate  each  to  one  of  two  bins.  The  chance  of  a  particular 
particle being allocated to bin 1 is $p$, so that the chance of a particular particle being
allocated to bin 2 must be $1 - p$. What is the chance $P(n)$ that we’ll find any $n$ particles
(without regard for order) in bin 1?

The chance that any one such combination occurs, with $n$ particles in bin 1 and $N - n$
particles in bin 2, is $p^n (1 - p)^{N-n}$. We need only count how many such combinations
there can be. Do this by labelling the particles $1, 2, \dots, N$ and simply writing down all
possible combinations. We can do this systematically by writing down all permutations
as if the particles were all lined up in a row. This keeps track of their order, which allows
us to count them more easily since now there are simply $N!$ possible permutations. For
the case of $N = 7$ particles in total, $n = 3$ of which appear in bin 1, we might write all
$7!$ permutations as (with bin 1 written first, then a space, then bin 2)

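$$
\left.
\begin{array}{l}
\left.
\begin{array}{l}
123\ \ 4567 \\
123\ \ 4576 \\
\quad\vdots \\
132\ \ 4567 \\
132\ \ 4576 \\
\quad\vdots
\end{array}
\right\}\ 3!\,4!\ \text{rows} \\[1ex]
\left.
\begin{array}{l}
124\ \ 3567 \\
124\ \ 3576 \\
\quad\vdots
\end{array}
\right\}\ 3!\,4!\ \text{rows} \\[1ex]
\quad\text{etc.}
\end{array}
\right\}\ 7!\ \text{rows}
\tag{2.1}
$$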


Each  combination  appears  3!  4!  times,  so  the  total  number  of  permutations,  7!,  over-counts 
the  number  of  combinations  by  this  factor.  Hence  the  number  of  combinations  is  7! / (3!  4!). 


Alternatively, we could focus on bin 1 and note that there are 7 × 6 × 5 = 7!/4! ways of putting
three particles into it if we take order into account (i.e. permutations); to count combinations,
we must correct for the fact that each combination produces 3! permutations, so must divide the
number of permutations by 3! to get 7!/(3! 4!). You might like to ponder on how to extend this
approach to the case of many bins that we’ll examine in Section 11.


More generally, the total number of combinations is $N!/[n!\,(N - n)!]$, also written ${}^N C_n$.
Each of these combinations occurs with probability $p^n (1 - p)^{N-n}$, so the final sought-after
probability is

$$ P(n) = \frac{N!}{n!\,(N - n)!}\; p^n (1 - p)^{N-n} . \tag{2.2} $$

This  function  of  the  number  of  particles  n  is  called  the  binomial  distribution. 
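
As a quick numerical companion to (2.2), here is a minimal Python sketch. It is not part of the original lectures, and the function name `binomial_pmf` is our own choice. It evaluates the binomial distribution directly for the small cases (5 and 10 molecules, with $p = 1/3$) that are worked by hand in the examples that follow, and should reproduce the values 0.33 and 0.23 quoted there.

```python
from math import comb


def binomial_pmf(n: int, N: int, p: float) -> float:
    """Probability (2.2) of finding exactly n of N particles in bin 1."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)


# The two small cases treated in the examples of this section.
p_example_1 = binomial_pmf(2, 5, 1 / 3)    # 5 molecules, 2 in the front third
p_example_2 = binomial_pmf(4, 10, 1 / 3)   # 10 molecules, 4 in the front third
print(round(p_example_1, 2), round(p_example_2, 2))

# The probabilities over all n must add up to 1.
total = sum(binomial_pmf(n, 10, 1 / 3) for n in range(11))
print(total)
```

Direct evaluation like this is only practical for modest $N$; for large $N$ the factorials are better handled through their logarithms.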

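Example 1: 5 molecules are in a room. What’s the chance that any 2 of them are in
the front 1/3 at some chosen moment?

[Diagram: 2 molecules in the front 1/3 of the room, 3 molecules in the back 2/3.]

$$ \text{Prob.} = \frac{5!}{2!\,3!} \left(\tfrac13\right)^2 \left(\tfrac23\right)^3 \simeq 0.33 . \qquad \text{Answer} \tag{2.3} $$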

Example 2: 10 molecules are in a room. What’s the chance that any 4 of them are
in the front 1/3 at some chosen moment?

[Diagram: 4 molecules in the front 1/3 of the room, 6 molecules in the back 2/3.]

$$ \text{Prob.} = \frac{10!}{4!\,6!} \left(\tfrac13\right)^4 \left(\tfrac23\right)^6 \simeq 0.23 . \qquad \text{Answer} \tag{2.4} $$

For larger factorials, use Stirling’s rule:

$$ n! \sim n^{n+1/2}\, e^{-n} \sqrt{2\pi} . \tag{2.5} $$

By “$\sim$” we mean lhs/rhs $\to 1$ as $n \to \infty$, although their difference might not go to zero.
In particular,

$$ \ln n! \sim \left(n + \tfrac12\right) \ln n - n + \ln\sqrt{2\pi} + \frac{1}{12n} - \frac{1}{360n^3} + \dots \tag{2.6} $$

This  is  an  example  of  an  asymptotic  series.  To  see  how  it  differs  from  the  more  well  known 
convergent  series,  consider  how  a  convergent  series  is  used:  if  (2.6)  were  such  a  series,  we  could 
fix  n  and  ensure  convergence  by  letting  the  number  of  terms  go  to  infinity.  In  other  words,  we 
could calculate $\ln n!$ to any accuracy by summing a sufficient number of terms.

In  contrast,  and  somewhat  bizarrely,  an  asymptotic  series  does  not  converge  in  this  way  for 
any  value  of  n.  The  coefficients  of  the  first  few  powers  of  n  in  (2.6)  start  out  by  decreasing  term  by 
term,  but  that  trend  soon  reverses  and  they  become  very  large.  For  any  choice  of  n,  they  soon  grow 
larger  at  a  faster  rate  than  the  (denominator)  powers  of  n,  so  that  the  series  can  never  converge. 

To use (2.6), we must truncate its right-hand side after some arbitrary number of terms, and then
note that increasing $n$ gives a different sort of convergence: $\lim_{n\to\infty} (\text{truncated rhs})/\text{lhs} = 1$. That means
we can’t use the series to calculate $\ln n!$ to arbitrary accuracy. Precisely where the truncation
might best be made to maximise the accuracy of the approximation is something of an art.


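Example 3: 1000 molecules are in a room. What’s the chance that any 400 of them
are in the front 1/3 at some chosen moment?

[Diagram: 400 molecules in the front 1/3 of the room, 600 molecules in the back 2/3.]

$$ \text{Prob.} = \frac{1000!}{400!\,600!} \left(\tfrac13\right)^{400} \left(\tfrac23\right)^{600} . \tag{2.7} $$

Then

$$
\begin{aligned}
\ln \text{prob.} \simeq{} & 1000.5 \ln 1000 - 1000 + \ln\sqrt{2\pi} \\
& - 400.5 \ln 400 + 400 - \ln\sqrt{2\pi} \\
& - 600.5 \ln 600 + 600 - \ln\sqrt{2\pi} \\
& + 400 \ln\tfrac13 + 600 \ln\tfrac23 \;\simeq\; -13.3716 ,
\end{aligned}
$$

where the terms $-1000 + 400 + 600$ cancel, as do the first two $\ln\sqrt{2\pi}$ terms,

$$ \text{so prob.} \simeq 1.559 \times 10^{-6} . \qquad \text{Answer} \tag{2.8} $$

(A presumably fairly exact answer from Mathematica is $\simeq 1.558 \times 10^{-6}$.)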
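
The arithmetic of Example 3 can be checked with a short Python sketch, offered here as an illustration rather than as part of the original notes. The helper names are our own. It evaluates $\ln$ prob. three ways: with the leading Stirling terms of (2.6), with the standard library's `math.lgamma` (which gives $\ln k!$ as `lgamma(k + 1)`), and with the cruder rule $\ln n! \simeq n \ln n - n$ that is discussed in the note below.

```python
from math import exp, lgamma, log, pi


def ln_factorial_stirling(n: float) -> float:
    """Leading terms of (2.6): (n + 1/2) ln n - n + ln sqrt(2 pi)."""
    return (n + 0.5) * log(n) - n + 0.5 * log(2 * pi)


def ln_prob(ln_factorial, N: int, n: int, p: float) -> float:
    """ln of the binomial probability (2.2), given a function for ln n!."""
    return (ln_factorial(N) - ln_factorial(n) - ln_factorial(N - n)
            + n * log(p) + (N - n) * log(1 - p))


ln_p_stirling = ln_prob(ln_factorial_stirling, 1000, 400, 1 / 3)
ln_p_lgamma = ln_prob(lambda k: lgamma(k + 1), 1000, 400, 1 / 3)
ln_p_crude = ln_prob(lambda k: k * log(k) - k, 1000, 400, 1 / 3)

print(ln_p_stirling, exp(ln_p_stirling))   # Stirling, as in (2.8)
print(ln_p_lgamma, exp(ln_p_lgamma))       # library value of ln k!
print(ln_p_crude)                          # cruder rule n ln n - n
```

Working with logarithms throughout avoids ever forming numbers like 1000! explicitly.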

N.B. Some books write $\ln n! \simeq n \ln n - n$. This will give $\ln \text{prob.} \simeq -9.7$ in Example 3,
so is clearly not accurate in this case, and it becomes more and more inaccurate as $n \to \infty$.
But compare this expression with the correct one when $n \sim 10^{24}$, and ask yourself whether
it might in fact be useful after all.


2.2 Quantifying Fluctuations in the Binomial
Distribution

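Consider $N = 10$ coins flipped. What is the chance that half land heads up?

$$ P(5\ \text{heads}) = \frac{10!}{5!\,5!} \left(\tfrac12\right)^5 \left(\tfrac12\right)^5 \simeq 0.25 . \tag{2.9} $$

Now for $N = 100$:

$$ P(50\ \text{heads}) = \frac{100!}{50!\,50!} \left(\tfrac12\right)^{50} \left(\tfrac12\right)^{50} \simeq 0.08 . \tag{2.10} $$

And now $N = 10^6$:

$$ P(500{,}000\ \text{heads}) = \frac{1{,}000{,}000!}{500{,}000!\;500{,}000!} \left(\tfrac12\right)^{500{,}000} \left(\tfrac12\right)^{500{,}000} \simeq 0.0008 . \tag{2.11} $$

Although the chance of getting $n$ heads is maximal if $n = N/2$, it goes to 0 as $N \to \infty$.
A more useful question: given $N$, $p$, what is the value of $n$ where $P(n\ \text{heads})$ peaks, and
what is a good measure of the width of the probability distribution? Or, rather than ask
where the peak lies, ask: what is the mean number of heads, often written $\bar n$ or $\langle n \rangle$?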
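
The three peak probabilities above are easy to reproduce numerically. The following Python sketch is an illustration only (the function name `prob_half_heads` is our own). It evaluates $P(N/2\ \text{heads})$ for fair coins through logarithms of factorials, using `math.lgamma(k + 1)` for $\ln k!$, so that the case $N = 10^6$ causes no overflow.

```python
from math import exp, lgamma, log


def prob_half_heads(N: int) -> float:
    """P(N/2 heads) for N fair coins (N even), evaluated through logarithms."""
    half = N // 2
    ln_p = lgamma(N + 1) - 2 * lgamma(half + 1) + N * log(0.5)
    return exp(ln_p)


for N in (10, 100, 10**6):
    print(N, prob_half_heads(N))
```

The printed values fall as $N$ grows, roughly like $1/\sqrt{N}$, which is the behaviour that the width of the distribution will explain.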

Basic probability theory gives $\bar n = pN$. Alternatively we can use a first-principles
approach to write

$$ \bar n = \sum_{n=0}^{N} n\, P(n) = \sum_{n=0}^{N} n\; {}^N C_n\, p^n (1 - p)^{N-n} . \tag{2.12} $$

This looks to be a difficult expression to evaluate. But we can do it using a kind of trick:
replace $1 - p$ (when it appears explicitly) by “$q$” and treat $q$ initially as an independent
variable, only setting it equal to $1 - p$ at the end of the calculation. Now make use of two
expressions:

$$ (p\,\partial_p)^k\, p^n = n^k p^n \qquad\text{and}\qquad (p + q)^N = \sum_{n=0}^{N} {}^N C_n\, p^n q^{N-n} , \tag{2.13} $$

where $\partial_p \equiv \partial/\partial p$. Use the first of these in (2.12) with $k = 1$, then the second, to write

$$ \bar n = \sum_n {}^N C_n\; p\,\partial_p\, p^n q^{N-n} = p\,\partial_p (p + q)^N = pN (p + q)^{N-1} = pN , \tag{2.14} $$

as expected. For the measure of fluctuation, use the variance $\sigma^2 = \overline{n^2} - \bar n^2$. Again
use (2.13), now with $k = 2$, to write

$$
\begin{aligned}
\overline{n^2} &= \sum_n n^2 P(n) = \sum_n n^2\; {}^N C_n\, p^n q^{N-n} = \sum_n (p\,\partial_p)^2\; {}^N C_n\, p^n q^{N-n} \\
&= \dots = p^2 N^2 + N p (1 - p) .
\end{aligned}
\tag{2.15}
$$

So $\sigma^2 = N p (1 - p)$. Define the relative fluctuation $\equiv \sigma/\bar n \propto 1/\sqrt{N}$.
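
The results $\bar n = pN$ and $\sigma^2 = Np(1 - p)$ can be confirmed by brute force for small $N$. The Python sketch below is our own illustration (the name `binomial_moments` is hypothetical). It simply carries out the sums over $n\,P(n)$ and $n^2 P(n)$ numerically, with no use of the differentiation trick, and compares them with the closed forms.

```python
from math import comb


def binomial_moments(N: int, p: float):
    """Return (mean, variance) of n by summing n P(n) and n^2 P(n) directly."""
    pmf = [comb(N, n) * p**n * (1 - p) ** (N - n) for n in range(N + 1)]
    mean = sum(n * P for n, P in enumerate(pmf))
    mean_sq = sum(n * n * P for n, P in enumerate(pmf))
    return mean, mean_sq - mean**2


N, p = 10, 1 / 3
mean, var = binomial_moments(N, p)
print(mean, N * p)              # direct sum versus pN
print(var, N * p * (1 - p))     # direct sum versus Np(1 - p)
```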

Example 1. 10 molecules in a room. What is the mean number in the front third of
the room, and what is its relative fluctuation?

$$ \bar n = pN = \tfrac13 \times 10 = 3\tfrac13 . \qquad \text{Answer} $$

$$ \sigma = \sqrt{N p (1 - p)} = \sqrt{20}/3 , \quad\text{so}\quad \sigma/\bar n \simeq 0.45 . \qquad \text{Answer} \tag{2.16} $$

Example 2. Do the same for $10^{27}$ molecules in a room, a realistic figure.

$$ \bar n = pN = \tfrac13 \times 10^{27} . \qquad \text{Answer} $$

$$ \sigma/\bar n = \sqrt{\frac{1 - p}{\bar n}} = \sqrt{\frac{2/3}{\tfrac13 \times 10^{27}}} \simeq 4.5 \times 10^{-14} . \qquad \text{Answer} \tag{2.17} $$


This  is  tiny!  Systems  with  stupendously  large  numbers  of  entities  are  very  predictable. 
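
To see how quickly the relative fluctuation shrinks with system size, one can tabulate $\sigma/\bar n = \sqrt{(1 - p)/(Np)}$ for a range of $N$. The sketch below is an illustration in Python. The function name and the intermediate value $N = 10^6$ are our own choices; the two end cases are those of the examples above.

```python
from math import sqrt


def relative_fluctuation(N: float, p: float) -> float:
    """sigma / mean = sqrt((1 - p) / (N p)) for the binomial distribution."""
    return sqrt((1 - p) / (N * p))


for N in (10, 1e6, 1e27):
    print(f"N = {N:g}: sigma/mean = {relative_fluctuation(N, 1 / 3):.2g}")
```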


2.3  The  Gaussian  Approximation  to  the  Binomial 
Distribution 

Calculating $P(n) = {}^N C_n\, p^n (1 - p)^{N-n}$ via Stirling is tedious and doesn’t give a feel for $P(n)$.
Let’s do better by approximating its logarithm by a Taylor series. (Why its logarithm?
Because this is less peaked and so yields a better approximation.)


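$$ f(n) \equiv \ln P(n) = f(\bar n) + f'(\bar n)(n - \bar n) + \tfrac12 f''(\bar n)(n - \bar n)^2 + \dots \tag{2.18} $$

$$
\begin{aligned}
f(n) &= \ln N! - \ln n! - \ln (N - n)! + n \ln p + (N - n) \ln q \\
&\simeq \left(N + \tfrac12\right) \ln N - \left(n + \tfrac12\right) \ln n - \left(N - n + \tfrac12\right) \ln (N - n) \\
&\qquad - \ln\sqrt{2\pi} + n \ln p + (N - n) \ln q .
\end{aligned}
\tag{2.19}
$$

Differentiating,

$$
\begin{aligned}
f'(n) &\simeq -\ln n + \ln (N - n) - \frac{1}{2n} + \frac{1}{2(N - n)} + \ln p - \ln q , \\
f''(n) &\simeq \frac{-1}{n} + \frac{1}{2n^2} - \frac{1}{N - n} + \frac{1}{2(N - n)^2} .
\end{aligned}
\tag{2.20}
$$

Thus, if $p$ isn’t close to 0 or 1 (which are statistically uninteresting cases anyway),

$$ f(\bar n) \simeq -\ln\sqrt{2\pi\sigma^2} , \qquad f'(\bar n) \simeq \frac{2p - 1}{2\sigma^2} , \qquad f''(\bar n) \simeq \frac{-1}{\sigma^2} . \tag{2.21} $$

So, with $x \equiv n - \bar n$,

$$ f(n) \simeq -\ln\sqrt{2\pi\sigma^2} + \frac{(2p - 1)\,x - x^2}{2\sigma^2} . \tag{2.22} $$

Complete the square to give

$$ P(n) \propto \exp \frac{-\left[n - \left(\bar n + p - \tfrac12\right)\right]^2}{2\sigma^2} . \tag{2.23} $$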


Remember that $\bar n \gg 1$, so this is a gaussian centred around $\bar n$ approximately, with width $\sigma$.
It’s usual to approximate it by

$$ P(n) \simeq \frac{1}{\sigma\sqrt{2\pi}} \exp \frac{-(n - \bar n)^2}{2\sigma^2} , \qquad\text{where}\quad \bar n = Np , \quad \sigma^2 = N p (1 - p) . \tag{2.24} $$

Example: With $N = 10^{27}$ molecules in a room, what’s the chance that the mean
number in the front third of the room actually occupies that front third?

$N$ is so large here that we should use (2.24). The mean number in the front third of the
room is $\bar n = \tfrac13 \times 10^{27}$. Set $n = \bar n$ in (2.24) to get

$$ \sigma = \frac{\sqrt{20}}{3} \times 10^{13} , \qquad P(\bar n) \simeq \frac{1}{\frac{\sqrt{20}}{3} \times 10^{13} \times \sqrt{2\pi}} \simeq 3 \times 10^{-14} . \qquad \text{Answer} \tag{2.25} $$

What’s the chance that this number fluctuates upwards by 1%?

$$ P(1.01\,\bar n) \simeq P(\bar n) \exp \frac{-(0.01\,\bar n)^2}{2\sigma^2} \simeq P(\bar n) \exp\left(-\tfrac14 \times 10^{23}\right) \approx 10^{-10^{22}} . \qquad \text{Answer} \tag{2.26} $$

This is very small: even a 1% fluctuation can be treated as never occurring. More realistically,
what’s the chance that the occupation number fluctuates by at least 1% up or
down? With more effort we can show that the answer is “close” to the number in (2.26),
but we won’t do that calculation here.
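
A numerical comparison shows how good the gaussian approximation (2.24) is, namely a gaussian with mean $Np$ and variance $Np(1 - p)$. The Python sketch below is an illustration under our own choices: $N = 1000$ and the sample values of $n$ are arbitrary, and the function names are hypothetical. The last lines repeat the estimate of (2.26) by computing the base-10 exponent of the suppression factor for a 1% upward fluctuation when $N = 10^{27}$.

```python
from math import comb, exp, log, pi, sqrt


def binomial_pmf(n: int, N: int, p: float) -> float:
    """Exact binomial probability (2.2)."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)


def gaussian_pmf(n: float, N: float, p: float) -> float:
    """Gaussian approximation (2.24): mean Np, variance Np(1 - p)."""
    mean, var = N * p, N * p * (1 - p)
    return exp(-((n - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)


N, p = 1000, 1 / 3
for n in (300, 333, 360, 400):
    print(n, binomial_pmf(n, N, p), gaussian_pmf(n, N, p))

# Example in the text: N = 1e27, p = 1/3, upward fluctuation of 1%.
N_big, p_big = 1e27, 1 / 3
var_big = N_big * p_big * (1 - p_big)
log10_suppression = -((0.01 * N_big * p_big) ** 2) / (2 * var_big) / log(10)
print(log10_suppression)   # base-10 exponent of P(1.01 mean) / P(mean)
```

The agreement is closest near the peak and degrades in the tails, where the higher-order terms neglected in the Taylor series matter more.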


2.4  Integrating  a  Gaussian  Function 

Gaussian  functions  are  common  throughout  probability  theory,  and  statistical  mechanics  is 
no  exception.  You  will  often  find  yourself  integrating  them,  so  here  is  a  good  place  to  write 
down  a  general  expression  for  the  gaussian  integral  in  terms  of  the  error  function  erf  x. 
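First in one dimension,

$$ \int e^{-ax^2 + bx}\, dx = \frac12 \sqrt{\frac{\pi}{a}}\; \exp\!\left(\frac{b^2}{4a}\right) \operatorname{erf}\!\left(\sqrt{a}\,x - \frac{b}{2\sqrt{a}}\right) + \text{constant} . \tag{2.27} $$

This expression is true for all values of $a$ and $b$, even complex ones. To help visualise it,
note that $\operatorname{erf} x$ is a strictly increasing odd function over the reals. It’s shaped much like
$\tan^{-1} x$ for real $x$, except that $\operatorname{erf} \infty = 1$. A special case of (2.27) is

$$ \int_{-\infty}^{\infty} e^{-ax^2 + bx}\, dx = \sqrt{\frac{\pi}{a}}\; \exp\!\left(\frac{b^2}{4a}\right) . \tag{2.28} $$

For a complex integration, it’s useful to remember that $\operatorname{erf}(-z) = -\operatorname{erf} z$ for all complex $z$,
and $\operatorname{erf} z \to 1$ as $|z| \to \infty$, provided $|\arg z| < \pi/4$.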
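
The one-dimensional results can be checked numerically for real $a > 0$ and real $b$. The Python sketch below is an illustration only. It assumes that (2.27) has the standard closed form $\tfrac12\sqrt{\pi/a}\, e^{b^2/4a} \operatorname{erf}\!\big(\sqrt{a}\,x - b/(2\sqrt{a})\big)$, uses `math.erf` (which accepts real arguments only), and compares against a plain midpoint-rule quadrature. The values $a = 2$, $b = 1$ and the function names are our own choices.

```python
from math import erf, exp, pi, sqrt


def gaussian_antiderivative(x: float, a: float, b: float) -> float:
    """Assumed closed form of (2.27) for real a > 0 and real b (constant omitted)."""
    return (0.5 * sqrt(pi / a) * exp(b * b / (4 * a))
            * erf(sqrt(a) * x - b / (2 * sqrt(a))))


def midpoint_integral(f, lo: float, hi: float, steps: int = 200_000) -> float:
    """Plain midpoint-rule quadrature, used only as an independent check."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


a, b = 2.0, 1.0


def integrand(x: float) -> float:
    return exp(-a * x * x + b * x)


numeric = midpoint_integral(integrand, -1.0, 3.0)
closed = gaussian_antiderivative(3.0, a, b) - gaussian_antiderivative(-1.0, a, b)
print(numeric, closed)

# Special case (2.28): the integral over the whole real line.
numeric_full = midpoint_integral(integrand, -20.0, 20.0)
closed_full = sqrt(pi / a) * exp(b * b / (4 * a))
print(numeric_full, closed_full)
```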

The corresponding definite integral in multi dimensions also comes in handy. Suppose
the $n$ integration variables $x_1, \dots, x_n$ are written as a column vector $x$, $A$ is a real
symmetric $n \times n$ matrix, and $b$ is a column vector. Then, with “t” denoting transpose,

$$ \int_{-\infty}^{\infty} \exp\left(-x^t A x + b^t x\right) dx_1 \dots dx_n = \frac{\pi^{n/2} \exp\left(b^t A^{-1} b/4\right)}{\sqrt{\det A}} . \tag{2.29} $$


3 Accessible States and the Fundamental
Postulate of Statistical Mechanics

At  the  heart  of  statistical  mechanics  is  the  idea  of  counting  the  number  of  states  that 
a  system  can  occupy.  An  isolated  system  is  in  equilibrium  when  the  probabilities  that 
it  will  be  in  various  states  are  constant  over  time.  The  characteristic  time  needed  for  a 
perturbed system to attain equilibrium is called its relaxation time. In this course we’ll
assume that systems are always at, or arbitrarily close to, equilibrium. This requires
that  all  changes  happen  slowly  compared  to  the  relaxation  time.  Such  processes  are  called 
quasistatic. 

The  fundamental  postulate  of  statistical  mechanics  is 


An  isolated  system  in  equilibrium  is  equally  likely  to  be  in  any 
state  accessible  to  it. 


The total number of states accessible at some energy is called $\Omega$. So the chance of the
system being found in any particular state is $1/\Omega$.

Example 1: 3 coins are flipped. What is the chance that 2 of them land heads up?

$$ P(2\ \text{heads}) = {}^3 C_2 \left(\tfrac12\right)^2 \left(\tfrac12\right)^1 = 3/8 . \qquad \text{Answer} \tag{3.1} $$

The total number of states of 3 flipped coins is $\Omega = 8$. The number of states with 2 heads
is called the degeneracy of the 2-heads state, which is 3 in this case.

Example 2: 3 identical particles, spin 1/2. What is the chance that 2 particles have
spin up? Now $\Omega = 4$, corresponding to all down, 1 up, 2 up, 3 up. There’s no degeneracy or,
equivalently, we might choose to say that the degeneracy equals 1, so

$$ P(2\ \text{up}) = 1/4 . \qquad \text{Answer} \tag{3.2} $$
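
The state counting in these two examples can be made concrete by enumerating the states explicitly. The Python sketch below is our own illustration. It lists every heads/tails sequence for 3 distinguishable coins, counts the degeneracy of each number of heads, and then treats the identical-particle case by labelling a state only by its number of "up" entries (reusing "H" to stand for spin up).

```python
from collections import Counter
from itertools import product

# Distinguishable coins: every sequence of H/T is its own state.
coin_states = list(product("HT", repeat=3))
omega = len(coin_states)                                    # total number of states
degeneracy = Counter(state.count("H") for state in coin_states)
print(omega, degeneracy[2], degeneracy[2] / omega)          # 8 states, 3 with two heads

# Identical particles: only the number of "up" entries labels a state.
spin_states = {state.count("H") for state in coin_states}   # H stands for "up"
omega_identical = len(spin_states)
print(omega_identical, 1 / omega_identical)                 # 4 states, each with chance 1/4
```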

3.1  Density  of  States  for  a  Monatomic  Gas 

What is the number of states $\Omega$ for a gas of $N$ distinguishable particles of the same mass?
The number of states at any particular energy $E$ is generally extremely difficult, if not
impossible, to compute. But we can use the fact that the energy spacing between neighbouring
states is typically so tremendously small that the state energies can be treated as
a continuum. This is analogous to treating the mass of a ruler as distributed continuously
along its length. The mass is not really a continuum; it’s located in the nuclei of the
atoms that comprise the ruler. We cannot talk about the mass at a point a distance $L$
from one end. For this reason, the concept of mass density was invented: the mass density
is $\rho(L)$ at a point a distance $L$ from one end, and we calculate it by averaging over many
nuclei. We can use this density to calculate approximately how much mass is in some
small length $\Delta L$ of the ruler: it is $\Delta M(L) \simeq \rho(L)\,\Delta L$. We even refer to an infinitesimal
mass $dM = \rho(L)\,dL$ even though, strictly speaking, this has no proper physical meaning
for a ruler made of atoms.

Since discussing the number of states $\Omega(E)$ at some energy $E$ is often problematic,
we treat $\Omega(E)$ like the mass of the ruler at some point. That is, just as we modelled the
ruler as a continuum of mass and worked only with the mass density at a point, likewise
we approximate the spread of states as a continuum, and define a density of states. To
do this, write the total number of states in the energy range $0 \to E$ as $\Omega_{\rm tot}(E)$, like the
total mass of a ruler of length $L$. With a suitable “coarse graining”, $\Omega_{\rm tot}(E)$ is related to
the density of states $g(E)$ by $d\Omega_{\rm tot} = g(E)\,dE$. The number of states in a small energy
interval will then be $g(E)$ times the width of the interval. If the typical spacing of energy
levels around $E$ is $\Delta E$ (in practice an incredibly tiny number), then we might write the
number of states “at” $E$ as

$$ \Omega(E) \simeq \Delta\Omega_{\rm tot} \simeq g(E)\,\Delta E . \tag{3.3} $$

Hence we can calculate $g(E)$ by first finding $\Omega_{\rm tot}(E)$ and then writing

$$ g(E) = \frac{d\Omega_{\rm tot}}{dE} . \tag{3.4} $$

So rather than try to calculate $\Omega(E)$ for a gas of $N$ distinguishable particles, we’ll instead
calculate $g(E)$ via $\Omega_{\rm tot}(E)$. First, consider the number of states accessible to one particle
of energy $E$ and momentum $p$. In phase space (position-momentum space), the particle
has a range $[x]$ available to it in the $x$ direction (similarly $[y]$, $[z]$), and a range $[p_x]$ available
to it in the $p_x$ direction (similarly $[p_y]$, $[p_z]$). Quantum mechanically, the particle’s position
and momentum (in the $x$ direction) are defined only up to $[x]$, $[p_x]$ with at best $[x][p_x] \approx h$,
so we partition the phase space into cells where each cell defines one accessible state. For
example, for one-dimensional motion the $xp$-space is divided into cells of area $[x][p_x] = h$.

Does it make sense to define the volume of one cell of phase space by dividing by one factor of $h$ for
each dimension? Perhaps we should use $h/2$ instead? In fact, it doesn’t matter whether we use $h$
or $h/2$ or, for that matter, $100h$. All that matters is that we use a constant with the dimensions of
position × momentum, and $h$ is a convenient choice. We’ll explain why at the end of Section 5.2.

For one particle confined in a box of volume $V$, the “small” number of cells in phase space
around energy $E$ is then

$$ d\Omega_{\rm tot} = \frac{[x][p_x]}{h}\, \frac{[y][p_y]}{h}\, \frac{[z][p_z]}{h} = \frac{V\,[p_x][p_y][p_z]}{h^3} , \tag{3.5} $$

because $[x]$, $[y]$, $[z]$ range over the whole dimensions of the box, so their product is $V$.
The $[p_x]$, $[p_y]$, $[p_z]$ are very small ranges around the nominal values of $p_x, p_y, p_z$. The total
number of cells in phase space for this particle for all energies $0 \to E$ is $\Omega_{\rm tot}$:

$$ \Omega_{\rm tot}(E) = \frac{V}{h^3} \times \text{a volume of momentum space in 3 dimensions.} \tag{3.6} $$

For $N$ particles with total energy $E$, the total number of states up to energy $E$ is

$$ \Omega_{\rm tot}(E) = \frac{V^N}{h^{3N}} \times \text{a volume of momentum space in } 3N \text{ dimensions.} \tag{3.7} $$


Label the particles’ momenta $p_{1x}, p_{1y}, \dots, p_{Nz}$. Then $p_{1x}^2 + \dots + p_{Nz}^2 = 2mE$ where $m$ is the
particle mass. So we require the volume of a sphere in $3N$ dimensions with radius $\sqrt{2mE}$.
The volume of a sphere of radius $R$ in $n$ dimensions is

$$ \text{volume} = \frac{\pi^{n/2} R^n}{(n/2)!} . \tag{3.8} $$

(Try this formula for $n = 1, 2, 3$, using $(1/2)! = \sqrt{\pi}/2$.)

On a side note, the factorial is actually defined as a function over the complex numbers. An
alternative notation for it is $\Pi(z) \equiv z!$, which lends itself to writing its derivative $\Pi'(z)$. You will
usually see the equivalent notation $\Gamma(z + 1) \equiv z!$ in textbooks. Why is there a “+1” there? There
is no good reason for this, and I think we would all be better off by dropping it once and for all.
The gamma function with its unnecessary +1 was even declared outdated in favour of the factorial
by Sir M.J. Lighthill a half century ago in his classic text on Fourier analysis [2]. If you do use the
gamma function you will find yourself writing tedious expressions like $\Gamma(a + 1) = \Gamma(b + 1)\,\Gamma(c + 1)$,
and you will forever need to remind yourself of the +1 when calculating simple things like $\Gamma(5)$.
Strike a blow for notational simplicity and write $\Pi(z)$ when you need to use function notation for
the factorial.
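
The suggestion to try the sphere-volume formula for $n = 1, 2, 3$ can be followed numerically. In the Python sketch below, which is an illustration only, the formula is taken to be $\pi^{n/2} R^n/(n/2)!$, and the factorial of a half-integer is obtained from the standard library as `math.gamma(n/2 + 1)`, which is exactly where the "+1" discussed above makes its appearance. The function name `ball_volume` and the radius are our own choices.

```python
from math import gamma, pi


def ball_volume(n: int, R: float) -> float:
    """Volume of an n-dimensional sphere of radius R: pi^(n/2) R^n / (n/2)!."""
    return pi ** (n / 2) * R**n / gamma(n / 2 + 1)   # (n/2)! = gamma(n/2 + 1)


R = 2.0
print(ball_volume(1, R), 2 * R)               # length of a line segment
print(ball_volume(2, R), pi * R**2)           # area of a disc
print(ball_volume(3, R), 4 / 3 * pi * R**3)   # volume of an ordinary sphere
```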


Setting $n = 3N$ and $R = \sqrt{2mE}$ in (3.8), then writing $\gamma \equiv 3N/2$, (3.7) becomes

$$ \Omega_{\rm tot}(E) = \frac{V^N}{h^{2\gamma}}\, \frac{\pi^\gamma (2mE)^\gamma}{\gamma!} . \tag{3.9} $$

This is the total number of states up to energy $E$. We get at $\Omega(E)$, the number of states
at energy $E$, via the density of states $g(E)$ using (3.4):

$$ g(E) = \Omega_{\rm tot}'(E) = \frac{V^N}{h^{2\gamma}}\, \frac{(2\pi m)^\gamma\, \gamma E^{\gamma - 1}}{\gamma!} . \tag{3.10} $$

With $N$ large, use Stirling’s rule:

$$ g(E) \simeq \frac{V^N}{h^{2\gamma}}\, \frac{(2\pi m)^\gamma\, e^\gamma\, \gamma E^{\gamma - 1}}{\gamma^{\gamma + 1/2} \sqrt{2\pi}} = \sqrt{\frac{\gamma}{2\pi}}\; \frac{V^N}{E} \left(\frac{2\pi m e E}{h^2 \gamma}\right)^{\gamma} . \tag{3.11} $$

The main result here is that $g(E) \propto V^N E^{3N/2 - 1} \simeq V^N E^{3N/2}$ when $N$ is large. If we do
require the number of states “at” a particular energy $E$, we can estimate it using (3.3)
with a suitable choice of $\Delta E$. But it turns out that we won’t have to do this.

Do note that a consequence of the above continuum approximation is that the number
of states $\Omega(E)$ and their density $g(E)$ are often treated somewhat interchangeably in
statistical mechanics. We’ll explain why this is done at the end of Section 5.2.


Example 1. We have a cubic room of side 5 m with $N = 10^{27}$ distinguishable particles
at 300 K, each with mass equal to the average mass of an air molecule ($4.8 \times 10^{-26}$ kg).
What is the density of states $g(E)$? Use the Equipartition Theorem (proved later) to set
a value for $E$ of $3NkT/2$.

Notational device: write $a_b \equiv a \times 10^b$.

$$ \log_{10} g(E) \simeq 1.5_{27}\, \log_{10} \left[ \frac{25 \times 2\pi \times 4.8_{-26} \times 2.7 \times 3/2 \times 1.38_{-23} \times 300 \times 10^{27}}{(6.63_{-34})^2 \times 1.5_{27}} \right] \simeq 3.5_{28} . \tag{3.12} $$

$$ g(E) \approx 10^{3.5 \times 10^{28}}\ \text{states/joule.} \qquad \text{Answer} \tag{3.13} $$

How big is this number? If we just settle for writing it out as approximately 1 followed by
a string of 0s, with each 0 being 1 cm across, then the length of this string will be about 37
thousand million light years, or several times the extent of the observable universe. That’s
not how big the number is; rather, that’s just how big its decimal representation is. The
number itself is stupendously bigger.

Identical particles

To do the same analysis for identical particles, realise that if the number of states is much
larger than the number of particles (which is very true above), the above calculation will
over-count by a factor of $N!$. So divide the result in (3.13) by $10^{27}!$ (that’s a factorial, and
is left as an exercise) to obtain

$$ g(E) \approx 10^{8.4 \times 10^{27}}\ \text{states/joule.} \qquad \text{Answer} \tag{3.14} $$

If the number of states is not $\gg$ the number of particles, we will need the approach used
in quantum statistics later in the course.
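
The two enormous exponents above can be checked with a few lines of Python. This is an illustrative sketch, not part of the original notes. It assumes the density of states has the form $g(E) = V^N (2\pi m)^\gamma\, \gamma E^{\gamma-1}/(h^{2\gamma}\, \gamma!)$ with $\gamma = 3N/2$, works entirely with logarithms, and uses `math.lgamma(x + 1)` for $\ln x!$. The constants are rounded as in the example, and the variable names are our own.

```python
from math import lgamma, log, pi

# Rounded constants as used in the text's example (SI units).
k, h = 1.38e-23, 6.63e-34     # Boltzmann and Planck constants
N, T = 1e27, 300.0            # number of particles, temperature in kelvin
V, m = 5.0**3, 4.8e-26        # room volume in m^3, particle mass in kg

gamma_ = 3 * N / 2            # the text's gamma = 3N/2
E = 3 * N * k * T / 2         # equipartition value of the total energy

# ln g(E) for distinguishable particles, with ln(gamma!) = lgamma(gamma + 1).
ln_g = (N * log(V) + gamma_ * log(2 * pi * m / h**2)
        + log(gamma_) + (gamma_ - 1) * log(E) - lgamma(gamma_ + 1))
log10_g_dist = ln_g / log(10)

# Identical particles: divide g(E) by N!, i.e. subtract ln N!.
log10_g_ident = (ln_g - lgamma(N + 1)) / log(10)

print(f"{log10_g_dist:.2e}  {log10_g_ident:.2e}")
```

The printed numbers are the base-10 exponents of $g(E)$, to be compared with $3.5 \times 10^{28}$ and $8.4 \times 10^{27}$.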


3.2  Density  of  States  for  More  Complicated  Structures 

Consider the variables that quantify a substance’s energy. There is a background potential
energy $u_0$ along with various potential and kinetic energies, so the energy of one particle
is $\varepsilon$ for the following regimes:

Monatomic liquid and gas:

$$ \varepsilon = u_0 + \frac{p_x^2}{2m} + \dots + \frac{p_z^2}{2m} . \tag{3.15} $$

In liquids, $u_0$ is complicated as molecular configurations fluctuate rapidly. In gases,
$u_0 \simeq 0$.

Solid crystal lattice: particles are like harmonic oscillators, with $x, y, z$ measuring their
displacement from equilibrium:

$$ \varepsilon = u_0 + \frac{k x^2}{2} + \dots + \frac{k z^2}{2} + \frac{p_x^2}{2m} + \dots + \frac{p_z^2}{2m} . \tag{3.16} $$

Gas of complex molecules: with angular momenta $L_1, \dots, L_3$ about principal axes,
paired with moments of inertia $I_1, \dots, I_3$, along with reduced mass $\mu$, vibration
frequency $\omega$, vibrational separation $r$:

$$ \varepsilon = u_0 + \underbrace{\frac{p_x^2}{2m} + \dots}_{\text{translation}} + \underbrace{\frac{L_1^2}{2 I_1} + \dots}_{\text{rotation}} + \underbrace{\tfrac12 \mu \dot r^2 + \tfrac12 \mu \omega^2 r^2}_{\text{vibration}} . \tag{3.17} $$

Each  of  these  variables  that  contributes  to  the  energy  via  a  square  is  called  a  degree 
of  freedom.  (Why  are  the  two  terms  for  vibration  above  called  two  degrees  of  freedom 
when  one  cannot  be  changed  without  also  changing  the  other?  The  name  is  something  of 
a  misnomer;  a  degree  of  freedom  is  simply  defined  as  a  term  that  contributes  to  the  energy 
via  a  square.  Both  of  those  terms  do  that.) 

If we re-derive $g(E)$ for these more complex structures, the same general arguments
apply, but with some modifications as follows. Each extra degree of freedom contributes
an extra dimension per particle to the sphere in phase space. E.g., if the number of degrees
of freedom per particle is $\nu = 5$ (for a diatomic molecule, since experiments show that this
won’t spin around its main axis), the “$3N/2$” above is replaced by $5N/2$ and the various
coefficients change. Also, correcting for identical particles (when there are far more states
than particles, which is the case here) means dividing by $N!$, or approximately by $N^N$.
In that case, for an ideal gas we can write expressions for the number of accessible states
for distinguishable and for identical particles:


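$$\Omega_{\text{dist}} \propto V^N \left(\frac{E}{N}\right)^{\nu N/2}, \qquad \Omega_{\text{ident}} \propto \left(\frac{V}{N}\right)^{N} \left(\frac{E}{N}\right)^{\nu N/2}. \tag{3.18}$$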


The particles of a solid are not free to move about their container like a gas, so there is
no spatial volume term. Also, only the thermal energy $E - N u_0$ contributes to the number
of accessible states. So for a solid we write

$$\Omega_{\text{dist}} \propto (E - N u_0)^{\nu N/2}. \tag{3.19}$$


The particles of a solid are certainly distinguishable by their locations at the various lattice
sites, so there is no $\Omega_{\text{ident}}$ to be considered.


4  Zeroth  and  First  Laws  of  Thermodynamics 

When  two  systems  interact,  energy  can  be  transferred  in  three  ways: 

(a)  transfer  of  heat:  “thermal”  (conduction,  convection,  radiation), 

(b)  doing  work:  “mechanical”  (pressure,  fields), 

(c)  letting  particles  move:  “diffusive”  (permeable  membranes). 

Zeroth  Law  of  Thermodynamics 

If  two  systems  are  in  thermal/mechanical/diffusive  equilibrium  with  a 
third  system,  then  they’re  in  thermal/mechanical/diffusive  equilibrium 
with  each  other. 


4.1  Preparing  for  the  First  Law  of  Thermodynamics 


The  First  Law  of  Thermodynamics  is  a  statement  of  the  conservation  of  energy.  It  is 
usually — perhaps  always — expressed  in  terms  of  infinitesimals,  so  we  will  first  make  some 
comments  about  these. 

Infinitesimal quantities, also called differentials, are used extensively in statistical
mechanics. What does a quantity like $dt$ mean? Consider deriving a particle’s velocity $v(t)$
from its position $s(t)$ in one dimension. We might Taylor-expand $s(t + \Delta t)$ to write

$$\begin{aligned}
v(t) &= \lim_{\Delta t \to 0} \frac{s(t + \Delta t) - s(t)}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{s(t) + s'(t)\,\Delta t + \frac{1}{2!}\,s''(t)\,\Delta t^2 + \cdots - s(t)}{\Delta t} \\
&= \lim_{\Delta t \to 0} \left[ s'(t) + \frac{1}{2!}\,s''(t)\,\Delta t + \cdots \right] \\
&= s'(t).
\end{aligned} \tag{4.1}$$

But  we  could  just  as  well  write  all  of  this  as 

$$v(t) = \frac{s(t + dt) - s(t)}{dt} = \frac{s(t) + s'(t)\,dt - s(t)}{dt} = s'(t). \tag{4.2}$$

This last expression is an economical and elegant way of writing the previous one. By
writing “$dt$”, we really mean “$\Delta t + O(\Delta t^2)$” along with a statement of an eventual division
by $\Delta t$ and a limit being taken as $\Delta t \to 0$. So when speaking of an infinitesimal, or an
“infinitesimally small quantity”, we are really referring to the end result of a limit process
applied to the non-infinitesimal $\Delta t$.
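
The limit process behind (4.1) can be watched numerically. The following is only a sketch: the position function s(t) = t³ and the helper name `difference_quotient` are assumptions made for illustration and do not come from the lectures. It shows the difference quotient approaching s'(t), with an error that shrinks roughly in proportion to Δt, as the ½ s''(t) Δt term in (4.1) suggests.

```python
def s(t):
    """Hypothetical position function s(t) = t^3, so that s'(t) = 3 t^2."""
    return t ** 3


def difference_quotient(func, t, dt):
    """[func(t + dt) - func(t)] / dt, the quantity whose limit is taken in (4.1)."""
    return (func(t + dt) - func(t)) / dt


t0 = 2.0
exact_velocity = 3 * t0 ** 2  # s'(2) = 12

# Shrink dt and record how far the quotient is from the exact derivative.
errors = []
for dt in (1e-1, 1e-2, 1e-3, 1e-4):
    v = difference_quotient(s, t0, dt)
    errors.append(abs(v - exact_velocity))
    print(f"dt = {dt:g}: quotient = {v:.6f}, error = {errors[-1]:.2e}")
```

For this choice of s(t) the error is 6Δt + Δt², so each tenfold reduction in Δt reduces the error about tenfold.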

This sort of idea also applies to the delta function used widely in Fourier analysis. The delta
function $\delta(x)$ is usually defined as an infinitely tall spike at $x = 0$ and zero elsewhere, with
$\int_{-\infty}^{\infty} \delta(x)\,dx = 1$. Any expression involving this function can be treated as the limit of a
sequence of similar expressions that each replace the delta by a bell-shaped function, where these
bell-shaped functions become increasingly narrow and high in the limit.

Treating delta functions and infinitesimals as tied to a limit process gives them a firm foundation,
although with delta functions we need always to ask whether it’s valid to swap the relevant
manipulation  and  the  limit  process.  You  will  find  the  occasional  book  stating  that  infinitesimals 
need  advanced  ideas  of  differential  geometry  to  give  them  substance.  If  you  do  investigate  further 
to  analyse  what  this  might  mean,  I  suggest  that  you’ll  only  find  notation  that  became  briefly 
fashionable  some  decades  ago  but  never  went  anywhere,  presumably  because  what  it  was  designed 
to  do  was  able  to  be  done  more  simply  in  other  ways.  The  kernel  of  what  infinitesimals  are  all 
about  is  contained  in  (4.1)  and  (4.2).  Infinitesimals  as  defined  by  equations  like  these  are  certainly 
used  routinely  in  differential  geometry,  where  they  have  a  natural  and  very  central  role.  So,  rather 
than  say  infinitesimals  need  differential  geometry  to  give  them  meaning,  I  would  choose  to  say 
that  infinitesimals  help  to  give  differential  geometry  meaning. 


Differentials  are  Increases 

Correctly translating a physics task into the language of mathematics goes a long way
to making it tractable. In particular for doing problems in thermodynamics, we’ll stress
the following point. For any quantity $f$, the symbols $\Delta f$ and $df$ refer to increases in $f$,
meaning $f_{\text{final}} - f_{\text{initial}}$. Similarly, $-\Delta f$ and $-df$ are decreases in $f$. Usually $\Delta f$ and $df$
are called “changes in $f$”, but this is not a very useful phrase if we give “change” its everyday
meaning of the absolute value of increase or decrease. After all, discarding a sign is not a
good idea! Likewise, when we write $\nabla f$ and $\partial f$ in the context of partial derivatives, we are
referring indirectly to increases in $f$ as other quantities are increased. And remember that
an increase can be negative: that’s what is meant by a decrease. The same idea holds for
vectors too: by $\Delta\mathbf{v}$ we mean $\mathbf{v}_{\text{final}} - \mathbf{v}_{\text{initial}}$. This is the increase in $\mathbf{v}$, although the idea
of a vector increasing might not be as intuitive as it is for everyday numbers, because the
length of $\mathbf{v}$ needn’t change when $\mathbf{v}$ increases. But that’s okay; after all, who said anything
about length? Just remember that

$$\Delta = \text{increase} = \text{final} - \text{initial}. \tag{4.3}$$

If we keep the correct language in mind, we will have no problems recognising that a
term like $-dV$ means a decrease in volume: when a volume $V$ gets smaller, the amount
by which it decreased, $-dV$, is positive. For example, when we push a piston to squeeze
the air in a cylinder, we do work equal to the pressure $P$ that we applied (always positive)
times the loss in volume $-dV$ (again positive), or $-P\,dV$; thus the energy $E$ of the gas
increases by this amount: $dE = -P\,dV$. The pressure changes as the volume decreases,
which is why we write this using infinitesimals. We could also write

$$\Delta E = \int_{V_i}^{V_f} -P\,dV, \tag{4.4}$$

which means neither more nor less than the infinitesimal expression (but takes more room
to write!). Keeping clear in our minds whether the quantities appearing in the equations
of statistical mechanics are increasing or decreasing helps us relate the mathematics to the
physics. A good example of such care will appear in Section 9.3.
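
As a numerical illustration of the sign bookkeeping in (4.4), the sketch below sums −P dV over many small steps of a compression. The pressure law P = c/V, the value of c, the volumes, and the function name `work_done_on_gas` are all assumptions chosen for the example; nothing here is prescribed by the lectures. Because each dV is negative during a compression, each contribution −P dV is positive.

```python
import math


def work_done_on_gas(pressure, v_initial, v_final, steps=100_000):
    """Approximate the integral of -P dV from v_initial to v_final (midpoint rule)."""
    dv = (v_final - v_initial) / steps  # an increase in V; negative for a compression
    total = 0.0
    for i in range(steps):
        v_mid = v_initial + (i + 0.5) * dv
        total += -pressure(v_mid) * dv  # each term is -P dV
    return total


# Hypothetical pressure law P = c / V, with an assumed constant c = 100 J.
c = 100.0


def pressure(v):
    return c / v


# Halve the volume: V_i = 2 litres, V_f = 1 litre (in cubic metres).
delta_e = work_done_on_gas(pressure, v_initial=2.0e-3, v_final=1.0e-3)
exact = c * math.log(2.0)  # analytic value of the same integral
print(f"numerical: {delta_e:.6f} J, analytic: {exact:.6f} J")
```

Swapping the two volumes describes an expansion; the same function then returns a negative number, that is, a negative increase.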


Exact  and  Inexact  Differentials 


A differential $df$ is called exact if the state of a system possesses a unique value of $f$.

As an example, suppose two people (“1” and “2”) walk from Adelaide to Melbourne.
They follow different paths and meet at some point en route. This point has a particular
height $h$ above sea level. The position of each walker always has a unique value of $h$,
and so $dh$ is an exact differential. If we write a walker’s height $h(\alpha, \omega)$ as a function of
latitude $\alpha$ and longitude $\omega$, then $\Delta h = \int dh$ is independent of the path that each took to
get to their meeting point. Also, $\Delta h$ equals “final height minus initial height”.

Their meeting point has a particular latitude $\alpha$ and longitude $\omega$, and so $d\alpha, d\omega$ are
also exact differentials. An exact differential like $dh$ can be written as $dh = A\,d\alpha + B\,d\omega$,
which is entirely equivalent to writing

$$\frac{\partial h}{\partial\alpha} = A, \qquad \frac{\partial h}{\partial\omega} = B. \tag{4.5}$$


Contrast the walkers’ height $h$ at their meeting point with the distances they each have
walked. Their common position doesn’t possess a unique value of a variable called
distance $s$. Certainly $s$ is well defined for each walker, but in general they have walked
different distances $s_1, s_2$.

Now suppose each walker takes an infinitesimal step, increasing $s_1$ and $s_2$ by infinitesimal
amounts. These steps are certainly well defined, but are not exact differentials because
the state called “position” does not have a unique $s$. The steps are written đ$s_1$, đ$s_2$, and
are called inexact differentials. We can use a generic step đ$s$ to write the total distances
covered by each walker on arriving in Melbourne as

$$s_1 = \int_{\text{path 1}} \text{đ}s, \quad\text{and}\quad s_2 = \int_{\text{path 2}} \text{đ}s. \tag{4.6}$$


Notice that we might not be inclined to write their total distances covered as $\Delta s_1, \Delta s_2$.
That’s because $\Delta s$ means “final $s$ minus initial $s$”, but there is no variable $s$ that is a
function of position. Perhaps we could put a bar through the $\Delta$ symbol, but this is not
something that anyone does.

Another example of an inexact differential is the small area element used to discuss
Gauss’s theorem in electromagnetism. This is usually written $dA$, but it’s not as if we
have an area $A$ that’s increasing by $dA$. This area element is more properly written đ$A$.
Likewise, the force due to air pressure that acts on a small surface of area đ$A$ could be
written đ$F$ but it’s more usually written $dF$. Perhaps the use of đ is confined to statistical
mechanics, but there’s no real reason why this should be so.
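
Returning to the two walkers, the difference between an exact and an inexact differential can be checked numerically. Everything specific in the sketch below is assumed for illustration: the height function `h`, the two paths, the endpoints, and the treatment of latitude and longitude as flat planar coordinates when measuring distance. Summing dh = (∂h/∂α) dα + (∂h/∂ω) dω along each path gives the same Δh, whereas summing the steps đs gives a different total distance for each path.

```python
import math


def h(a, w):
    """Hypothetical height as a function of latitude a and longitude w."""
    return math.sin(a) * math.cos(w) + 0.5 * a * w


def dh_da(a, w):
    """Partial derivative of h with respect to a, holding w constant."""
    return math.cos(a) * math.cos(w) + 0.5 * w


def dh_dw(a, w):
    """Partial derivative of h with respect to w, holding a constant."""
    return -math.sin(a) * math.sin(w) + 0.5 * a


def integrate_along(path, steps=20_000):
    """Return (sum of dh, sum of step lengths) along path(u) for u from 0 to 1."""
    total_dh = 0.0
    total_ds = 0.0
    a_prev, w_prev = path(0.0)
    for i in range(1, steps + 1):
        a, w = path(i / steps)
        da, dw = a - a_prev, w - w_prev
        a_mid, w_mid = 0.5 * (a + a_prev), 0.5 * (w + w_prev)
        total_dh += dh_da(a_mid, w_mid) * da + dh_dw(a_mid, w_mid) * dw
        total_ds += math.hypot(da, dw)  # flat-plane step length (an assumption)
        a_prev, w_prev = a, w
    return total_dh, total_ds


start, end = (0.0, 0.0), (1.0, 2.0)


def path_1(u):
    """Straight line from start to end."""
    return (start[0] + u * (end[0] - start[0]),
            start[1] + u * (end[1] - start[1]))


def path_2(u):
    """Same endpoints as path_1, but with a sideways detour."""
    a, w = path_1(u)
    return (a + 0.8 * math.sin(math.pi * u), w)


delta_h_1, s_1 = integrate_along(path_1)
delta_h_2, s_2 = integrate_along(path_2)
print(f"Delta h: {delta_h_1:.6f} and {delta_h_2:.6f}")
print(f"distance walked: {s_1:.4f} and {s_2:.4f}")
```

The two values of Δh agree with h(end) − h(start), while the two distances differ, which is the sense in which dh is exact and đs is not.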

The Inexact Differential “Heat in”, đQ

Suppose we are given a container of hot gas. The subject of thermodynamics deals with
the processes this gas might’ve undergone to bring it to its present state. It might’ve been
heated over a stove (“thermal” in the Zeroth Law), by doing work on it (“mechanical”),
by changing its chemical environment (“diffusive”), or some combination of the three.
Knowing nothing of the gas’s history, we cannot generally ascertain how it was heated.
Heating it over a stove gives it thermal energy $Q$, but doing work on it gives it mechanical
energy $\int -P\,dV$, which manifests as the same heat. Both of these actions lead to the
same final state, so we cannot say the gas has a unique “heat” $Q$ associated with it. We
can, however, talk about a small amount of heat being put into the gas, and this must
then be an inexact differential, written đ$Q$. And just as in the discussion immediately
following (4.6) of the distance $s$ covered by a walker, we will always write $Q$, never $\Delta Q$,
for a large amount of heat put into a system.

Although we have written the mechanical work done on the gas as $-P\,dV$, it’s sometimes
introduced by writing it as đ$W$. Again the bar is necessary because the state does
not have a unique parameter called work $W$ associated with it; any work we do on the
system must be an inexact differential. But notice that this inexact differential đ$W$ can
actually be written using an exact differential, as $-P\,dV$, because $V$ is certainly a state variable!
We’ll show later that the same can be done for đ$Q$. This rewriting of đ$Q$ in terms of a new
state variable was a key discovery of statistical mechanics, and was the central idea that
allowed thermodynamics to be analysed and extended using statistical mechanics ideas.

First  Law  of  Thermodynamics  (Provisional  Form) 


The infinitesimal increase in internal energy of a system is given by thermal,
mechanical, and diffusive contributions as follows:

$$dE = \text{đ}Q - P\,dV + \mu\,dN. \tag{4.7}$$

đ$Q$ = heat put in by e.g. a stove.

$-P\,dV$ = pressure × loss in volume = work done on system.

$\mu\,dN$ = chemical potential × increase in particle number = energy brought
in by new particles that isn’t related to heat transfer or work. It is due to the new
environment created by the incoming particles. (Think of adding water to a
concentrated acid: it heats up dangerously.)


The term $-P\,dV$ is just one example of work being done on the system. Others exist,
such as an electric field which can change dipole moments, giving a term $-\mathbf{E}\cdot d\mathbf{p}$. We will
write $-P\,dV$ to represent all such terms.

Our primary goal will be to rewrite the đ$Q$ in the First Law as an exact differential,
since the relevant quantities can then be used to quantify the state of a system. We’ll
take $V$ as one of these quantities from the outset (so won’t stop to write đ$W$), and will
see later how to replace đ$Q$ by something else.


4.2  Partial  Derivatives  and  Variables  Held  Constant 


A  good  understanding  of  partial  derivatives  is  useful  in  statistical  mechanics.  Here  are 
some  points  worth  noting. 

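Consider a function $f(x, y, z)$. When writing

$$\frac{\partial f(x,y,z)}{\partial x} \quad\text{or}\quad \left(\frac{\partial f}{\partial x}\right)_{y,z} \tag{4.8}$$

we mean $\partial f/\partial x$ at constant $y$ and $z$. Thus

$$\left(\frac{\partial x}{\partial f}\right)_{y,z} = \frac{1}{(\partial f/\partial x)_{y,z}}. \tag{4.9}$$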


However, normally when we swap the roles of e.g. $f$ and $x$, the set of variables that are
being held fixed actually changes, and so the simple reciprocation of (4.9) doesn’t apply.
A more familiar example relates polar coordinates to cartesians. Begin with

$$x = r\cos\theta, \qquad y = r\sin\theta. \tag{4.10}$$


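In such an arena when we write something like $\partial x/\partial r$, we mean $(\partial x/\partial r)_\theta$; that is,
differentiate with respect to one variable ($r$) holding all others of its family ($\theta$) constant. With
this convention, the set of partial derivatives of one set of coordinates with respect to the
other is usually written as the elements of a matrix, called a jacobian matrix:

$$\begin{bmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta} \\[2ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta} \end{bmatrix} = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}. \tag{4.11}$$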


There  are  two  jacobian  matrices:  one  has  the  partial  derivatives  of  cartesians  with  respect 
to  polars,  and  the  other  has  the  partial  derivatives  of  polars  with  respect  to  cartesians. 
Now  watch  carefully:  by  multiplying  these  two  matrices,  you  should  be  able  to  see  that 
the  following  relationship  holds: 


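$$\begin{bmatrix} \dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y} \\[2ex] \dfrac{\partial\theta}{\partial x} & \dfrac{\partial\theta}{\partial y} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta} \\[2ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta} \end{bmatrix}^{-1} = \begin{bmatrix} \cos\theta & \sin\theta \\[1ex] \dfrac{-\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{bmatrix}. \tag{4.12}$$

This is an important relationship because it enables us to invert partial derivatives when
the set of variables being held constant switches from one set of coordinates to the other.
For example, comparing the “1,1” elements of (4.11) and (4.12) shows that

$$\left(\frac{\partial x}{\partial r}\right)_\theta = \left(\frac{\partial r}{\partial x}\right)_y = \cos\theta. \tag{4.13}$$

With the above convention of omitting the constant variables in mind, this is normally
just written

$$\frac{\partial x}{\partial r} = \frac{\partial r}{\partial x} = \cos\theta. \tag{4.14}$$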


This  might  at  first  look  a  little  odd,  until  we  realise  that  each  derivative  holds  a  different 
variable  constant,  and  so  simple  reciprocation  cannot  be  used. 


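Example 1: Show that $(\partial x/\partial r)_\theta = 1/(\partial r/\partial x)_\theta$.

Start with $r = x/\cos\theta$, so that

$$\left(\frac{\partial r}{\partial x}\right)_\theta = \frac{1}{\cos\theta} = \frac{1}{(\partial x/\partial r)_\theta}.$$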

Example 2: What is $(\partial\theta/\partial r)_y$?

The easiest approach is to differentiate both sides of $y = r\sin\theta$ with respect to $r$, holding
$y$ constant, to get $0 = \sin\theta + r\cos\theta\,(\partial\theta/\partial r)_y$. Or, for a slight variation, calculate $(\partial r/\partial\theta)_y$
in the same way and form the reciprocal. Here is a third way, which might give you more
insight. Draw two infinitesimally separated points at constant $y$. One has polar
coordinates $(r, \theta)$; the other has $(r + dr, \theta + d\theta)$. Noting that we need keep only the lowest
powers necessary of the infinitesimals, write

$$\begin{aligned}
y = r\sin\theta &= (r + dr)\sin(\theta + d\theta) \\
&= (r + dr)(\sin\theta + \cos\theta\,d\theta) \\
&= r\sin\theta + \sin\theta\,dr + r\cos\theta\,d\theta.
\end{aligned} \tag{4.16}$$

So at constant $y$ we have $\sin\theta\,dr = -r\cos\theta\,d\theta$, in which case

$$\left(\frac{\partial\theta}{\partial r}\right)_y = \frac{-\sin\theta}{r\cos\theta}. \qquad \text{Answer} \tag{4.17}$$


Note  that  although  we  seemed  to  work  to  first  order  only,  the  answer  is  exact.  Can  you 
see  why?  If  not,  study  (4.1)  and  (4.2)  in  the  more  familiar  language  of  position  s(t)  and 
velocity  v(t )  in  one  dimension,  and  remember  that  velocity  is  a  first-order  increase  in 
position  with  respect  to  time. 
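
The jacobian relationship (4.12) and the answer to Example 2 can both be checked with a few lines of arithmetic. The sketch below is an illustration under assumed values (r = 2, θ = 0.7, and the helper names are hypothetical): it multiplies the two jacobian matrices to confirm that they are inverses, and it estimates (∂θ/∂r)_y by stepping r while solving y = r sin θ for θ at fixed y.

```python
import math


def jacobian_cartesian_wrt_polar(r, theta):
    """Matrix of (4.11): rows (x, y), columns (r, theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def jacobian_polar_wrt_cartesian(r, theta):
    """Right-hand side of (4.12): rows (r, theta), columns (x, y)."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta) / r, math.cos(theta) / r]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


r0, theta0 = 2.0, 0.7  # assumed sample point in the first quadrant
product = matmul2(jacobian_polar_wrt_cartesian(r0, theta0),
                  jacobian_cartesian_wrt_polar(r0, theta0))
print("product of the two jacobians:", product)

# (d theta / d r) at constant y: step r, then solve y = r sin(theta) for theta.
y0 = r0 * math.sin(theta0)
dr = 1e-6
theta_plus = math.asin(y0 / (r0 + dr))
theta_minus = math.asin(y0 / (r0 - dr))
numeric = (theta_plus - theta_minus) / (2 * dr)
analytic = -math.sin(theta0) / (r0 * math.cos(theta0))  # equation (4.17)
print(f"numeric: {numeric:.8f}, analytic: {analytic:.8f}")
```

The product is the identity matrix to rounding error, and the finite-difference estimate matches (4.17).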
5  Accessible  States  for  Interacting  Systems


We wish to focus on the elusive notion of heat transfer. So consider two systems interacting
thermally, but not mechanically or diffusively. They’re isolated, so their total energy is $E$,
a constant. System 1 has $N_1$ particles, each with $\nu_1$ d.o.f., and total energy $E_1$. Similarly
for system 2. What is the total number of accessible states $\Omega$ as a function of $E_1$?

[Figure: two adjoining boxes. System 1 has $N_1$, $\nu_1$ and energy $E_1$; system 2 has $N_2$, $\nu_2$ and energy $E_2 = E - E_1$.]

Define $\gamma_1 = \nu_1 N_1/2$ and $\gamma_2 = \nu_2 N_2/2$, so that the results of Section 3 give

$$\Omega_1 \propto E_1^{\gamma_1}, \qquad \Omega_2 \propto E_2^{\gamma_2}. \tag{5.1}$$

Then

$$\Omega(E_1) = \Omega_1\Omega_2 \propto E_1^{\gamma_1}\,(E - E_1)^{\gamma_2}, \qquad 0 \le E_1 \le E. \tag{5.2}$$

Consider plotting $\Omega(E_1)$ versus $E_1$. The stationary points occur when $\Omega'(E_1) = 0$:

$$\Omega'(E_1) \propto E_1^{\gamma_1 - 1}\,(E - E_1)^{\gamma_2 - 1}\left[\gamma_1 E - (\gamma_1 + \gamma_2)E_1\right] = 0. \tag{5.3}$$

So stationary points occur at $E_1 = 0$ and $E$ (minima), and $E_1 = \bar E_1 = \gamma_1 E/(\gamma_1 + \gamma_2)$
(maximum).

How wide is the peak at $E_1 = \bar E_1$? A useful measure is $\alpha$ such that

$$\Omega(\bar E_1 + \alpha\bar E_1) = \tfrac{1}{2}\,\Omega(\bar E_1), \tag{5.4}$$

so that $2\alpha\bar E_1$ is approximately the “full width at half maximum” (commonly known as the
FWHM). (This is only approximate, as the peak isn’t necessarily symmetric.) In (5.2)
we see that $\Omega$ is a product and involves powers, so its logarithm turns out to be simpler
to work with; also, a logarithm will lead to a better approximation of the peak when we
Taylor-expand shortly, because the logarithm of a strongly peaked function is not strongly
peaked and so needs fewer Taylor terms to describe it. So we introduce a new symbol
which will play a central role in statistical mechanics, called the statistical entropy:

$$\sigma = \ln\Omega. \tag{5.5}$$

Use this to write (5.2) as

$$\begin{aligned}
\sigma(E_1) &= \text{constant} + \gamma_1\ln E_1 + \gamma_2\ln(E - E_1), \\
\text{so}\quad \sigma'(E_1) &= \frac{\gamma_1}{E_1} - \frac{\gamma_2}{E - E_1}, \\
\text{and}\quad \sigma''(E_1) &= \frac{-\gamma_1}{E_1^2} - \frac{\gamma_2}{(E - E_1)^2}.
\end{aligned} \tag{5.6}$$


[We can’t “really” take the logarithm of a dimensioned number such as energy; but the
constant of proportionality in (5.2) effectively introduces a scaling factor for the units that
does allow us to take a log. But this constant has no effect on the physical arguments
and is perhaps a little tedious to include everywhere. See the comment just after (9.2).]
Equation (5.4) becomes

$$\sigma(\bar E_1 + \alpha\bar E_1) = -\ln 2 + \sigma(\bar E_1), \tag{5.7}$$

which Taylor-expands to

$$\underbrace{\sigma'(\bar E_1)}_{=\,0}\,\alpha\bar E_1 + \sigma''(\bar E_1)\,\frac{\alpha^2\bar E_1^2}{2} \simeq -\ln 2. \tag{5.8}$$


A little work gives

$$\alpha \simeq \sqrt{\frac{2\gamma_2\ln 2}{\gamma_1(\gamma_1 + \gamma_2)}}. \tag{5.9}$$


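When $\gamma_1 = \gamma_2 \sim 10^{24}$, we get $\alpha \simeq 10^{-12}$, and so the FWHM is about $2\times10^{-12}\,\bar E_1$. This
is tiny compared with $\bar E_1$, so the thermal interaction means that systems 1 and 2 are
extremely likely to have energies

$$\bar E_1 = \frac{\gamma_1 E}{\gamma_1 + \gamma_2} = \frac{\nu_1 N_1 E}{\nu_1 N_1 + \nu_2 N_2}, \qquad \bar E_2 = E - \bar E_1 = \frac{\gamma_2 E}{\gamma_1 + \gamma_2} = \frac{\nu_2 N_2 E}{\nu_1 N_1 + \nu_2 N_2} \tag{5.10}$$

respectively.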
Fluctuations  By what factor $f$ does the number of accessible states $\Omega$ drop if $E_1$ should
exceed $\bar E_1$ by one part per million? That is, calculate

$$f = \frac{\Omega(\bar E_1)}{\Omega\big((1 + 10^{-6})\bar E_1\big)}. \tag{5.11}$$

Taylor-expand $\ln f$, again taking $\gamma_1 = \gamma_2 \sim 10^{24}$:

$$\ln f = \sigma(\bar E_1) - \sigma(\bar E_1 + 10^{-6}\bar E_1) \simeq -\sigma''(\bar E_1)\,\frac{10^{-12}\,\bar E_1^2}{2} \simeq 10^{12}. \tag{5.12}$$

$$f \simeq e^{10^{12}} \simeq 10^{0.4343\times10^{12}} = 10^{434{,}300{,}000{,}000}. \qquad \text{Answer} \tag{5.13}$$

This is a huge drop. So, with the system equally likely to be in any of its accessible states
(by postulate), the chance of a 1 ppm fluctuation away from energies $\bar E_1, \bar E_2$ is so
minute that we can discount it from ever happening. (We should really do an integral
here to consider a fluctuation of at least 1 ppm, but the above calculation serves to give a
good idea of the numbers involved.)
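
The numbers in (5.12) and (5.13) are far too large to compute as a ratio of state counts, but the logarithm is easy to evaluate. The sketch below assumes γ₁ = γ₂ = 10²⁴ as in the text; the function name `log_drop_factor` is hypothetical. It uses the fact, which follows from (5.6) and (5.10), that −σ''(Ē₁) Ē₁² simplifies to γ₁(γ₁ + γ₂)/γ₂.

```python
import math


def log_drop_factor(gamma_1, gamma_2, fractional_excess):
    """ln f from the second-order Taylor expansion used in (5.12)."""
    # -sigma''(E1_bar) * E1_bar**2 simplifies to gamma_1 (gamma_1 + gamma_2) / gamma_2.
    curvature_term = gamma_1 * (gamma_1 + gamma_2) / gamma_2
    return 0.5 * curvature_term * fractional_excess ** 2


ln_f = log_drop_factor(1.0e24, 1.0e24, 1.0e-6)
log10_f = ln_f / math.log(10.0)  # convert the exponent from base e to base 10
print(f"ln f = {ln_f:.4e}, so f is about 10 to the power {log10_f:.4e}")
```

Working with ln f rather than f itself is what keeps the calculation within floating-point range.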


5.1  Defining  Temperature,  and  the  Equipartition  Theorem

Prior to the advent of statistical mechanics, the concept of temperature was already known
in a heuristic way from thermodynamics. An early success of statistical mechanics was its
precise definition of temperature using the above idea of two thermally interacting systems.
At thermal equilibrium when the energy has distributed itself as $\bar E_1, \bar E_2$, we define the two
systems to have equal values of temperature $T$. This is done by our noticing in (5.10) that
the average energy per particle per degree of freedom is the same for both systems, so that
the common value can be used to define their temperature:

$$\frac{\bar E_1}{\nu_1 N_1} = \frac{\bar E_2}{\nu_2 N_2} = \frac{E}{\nu_1 N_1 + \nu_2 N_2} \equiv \frac{kT}{2}. \tag{5.14}$$

Here $k$ is Boltzmann’s constant, inserted to allow this statistical definition of temperature
to be equated with the everyday thermodynamic idea of temperature (as we’ll soon see;
so we’ll assume from now on that temperature is a positive quantity). The factor of 2
ensures compatibility with other uses of Boltzmann’s constant. In practice we must provide
something extra to disentangle temperature from $k$. This is done by setting $T = 273.16$ K
at the triple point of water (≈ 0.01°C).


Note  that  the  SI  unit  of  the  Kelvin  temperature  scale  is  a  “kelvin”,  not  a  “Kelvin  degree”.  A 
temperature  of  100  K  is  vocalised  “one  hundred  kelvins” — not  “one  hundred  degrees  Kelvin”,  nor 
“one  hundred  Kelvin”.  In  common  with  all  SI  units,  “kelvin”  is  written  with  a  lowercase  k  but, 
being  someone’s  name,  its  short  form  uses  an  uppercase  K. 


The two systems can be intermixed. E.g., system 1 might refer to translation ($\nu_1 = 3$)
of all the (diatomic) molecules present, and system 2 might refer to their rotation ($\nu_2 = 2$).
So (5.14) says that the translational degrees of freedom possess energy $3kT/2$ per molecule,
and the rotational degrees of freedom possess energy $2kT/2$ per molecule. This constitutes
the Equipartition Theorem.


Equipartition  Theorem 


If the equilibrium distribution is

-  the most probable distribution consistent with constant total energy
and constant particle number, and

-  there is no restriction on the number of particles in any one state, and

-  thermal energy $E$ varies continuously with a coordinate $u$, and

-  $E$ depends on $u^2$,

then the energy associated with this coordinate is $kT/2$.
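
To put numbers to the theorem, the sketch below evaluates (ν/2)kT for the translational and rotational degrees of freedom of a diatomic molecule. The temperature of 300 K is an assumed example value, the function name is hypothetical, and the constant is the current SI value of Boltzmann’s constant.

```python
BOLTZMANN_K = 1.380649e-23  # J/K, current SI value


def equipartition_energy(degrees_of_freedom, temperature):
    """Mean thermal energy per particle, (nu / 2) k T, with temperature in kelvins."""
    return 0.5 * degrees_of_freedom * BOLTZMANN_K * temperature


T = 300.0  # kelvins, an assumed room temperature
translational = equipartition_energy(3, T)   # nu = 3
rotational = equipartition_energy(2, T)      # nu = 2
total_diatomic = equipartition_energy(5, T)  # nu = 5

print(f"translation: {translational:.4e} J per molecule")
print(f"rotation:    {rotational:.4e} J per molecule")
print(f"total:       {total_diatomic:.4e} J per molecule")
```

Each quadratic degree of freedom carries the same share, kT/2, which is about 2 × 10⁻²¹ J at this temperature.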


5.2  Entropy  and  the  Second  Law  of  Thermodynamics 


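Equation (3.10) implies that when $N$ is large, $\Omega = f(V, N)\,E^{\nu N/2}$ for some function $f$.
That means

$$\sigma(E, V, N) = \ln\Omega = \ln f(V, N) + \frac{\nu N}{2}\ln E. \tag{5.15}$$

Thus

$$\left(\frac{\partial\sigma}{\partial E}\right)_{V,N} = \frac{\nu N}{2E} \overset{\text{equipartition}}{=} \frac{\nu N}{2\,\nu N kT/2} = \frac{1}{kT}. \tag{5.16}$$

Define the thermodynamic entropy (usually just called entropy) of a general system to be
$S = k\sigma$, i.e.

$$S = k\ln\Omega. \tag{5.17}$$

With this definition, (5.16) becomes

$$\left(\frac{\partial S}{\partial E}\right)_{V,N} = \frac{1}{T}. \tag{5.18}$$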

This  is  often  taken  as  the  definition  of  T. 

Entropy is additive, since for any two systems 1 and 2 with entropies $S_1 = k\ln\Omega_1$ and
$S_2 = k\ln\Omega_2$, the entropy of the combined system before they interact is $k\ln(\Omega_1\Omega_2) =
k\ln\Omega_1 + k\ln\Omega_2 = S_1 + S_2$.

Since a system composed of two subsystems having attained equilibrium is overwhelmingly
likely to be found in a state for which $\Omega$, or $S$, is maximised, we can say that

When  two  systems  interact,  the  entropy  of  the  combined  system  increases 
along  the  path  to  equilibrium. 

This  is  one  statement  of  the  Second  Law  of  Thermodynamics.  There  are  several  others 
that  are  equivalent  on  various  levels. 

The  Use  of  Planck’s  Constant  for  Quantifying  Entropy 

Near the start of Section 3.1 we said that it doesn’t matter whether the volume of one cell
of phase space is defined by dividing by $h$ for each dimension, or $h/2$ or $100h$. The reason
is that throughout our study of entropy in statistical mechanics, it will only ever be
an increase in entropy, $\Delta S$, that has a role in the calculations. Even when we eventually
write $S$ by itself in (8.2), it is still only $\Delta S$ that ever matters.

With that in mind, consider how entropy relates to a volume $\mathcal{V}$ of phase space:

$$\Delta S = S_f - S_i = k\ln\Omega_f - k\ln\Omega_i = k\ln\frac{\Omega_f}{\Omega_i} = k\ln\frac{\mathcal{V}_f/h^{\nu N}}{\mathcal{V}_i/h^{\nu N}} = k\ln\frac{\mathcal{V}_f}{\mathcal{V}_i}. \tag{5.19}$$

It’s  apparent  that  we  could  replace  h  by  any  multiple  of  h  and  nothing  would  change  in 
the  last  equation:  AS  would  still  only  be  determined  by  a  ratio  of  phase  space  volumes. 
In  fact,  h  isn’t  needed  at  all.  As  used  in  Section  3.1  it  was  really  only  a  device  giving  us  a 
way  of  specifying  and  counting  states  for  a  continuous  system,  which  is  a  modern  way  of 
approaching  entropy.  An  alternative  approach  might  define  entropy  through  the  idea  of 
phase  space  volume  alone,  but  that  would  divorce  entropy  from  the  idea  of  the  number  of 
accessible  states.  Defining  entropy  via  the  number  of  accessible  states  allows  us  to  build 
an  intuition  about  it,  because  we  can  then  consider  very  simple  discrete  systems  and  count 
their  states  easily. 
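
The cancellation of the cell size in (5.19) is easy to confirm numerically. In the sketch below the phase-space volumes, the number of dimensions, and the function name are toy assumptions; only the structure Ω = (phase-space volume)/(cell size per dimension)^(dimensions) is taken from the discussion above. The entropy increase is the same whether one cell is sized by h, h/2, or 100h.

```python
import math

BOLTZMANN_K = 1.380649e-23  # J/K
PLANCK_H = 6.62607015e-34   # J s


def entropy_increase(volume_initial, volume_final, cell_size, dimensions):
    """k ln(Omega_f / Omega_i), with Omega = volume / cell_size**dimensions."""
    ln_omega_i = math.log(volume_initial) - dimensions * math.log(cell_size)
    ln_omega_f = math.log(volume_final) - dimensions * math.log(cell_size)
    return BOLTZMANN_K * (ln_omega_f - ln_omega_i)


# Assumed toy numbers: the accessible phase-space volume doubles.
v_i, v_f, dims = 1.0e-190, 2.0e-190, 6
results = [entropy_increase(v_i, v_f, cell, dims)
           for cell in (PLANCK_H, PLANCK_H / 2, 100 * PLANCK_H)]
for value in results:
    print(f"Delta S = {value:.6e} J/K")
```

All three results equal k ln 2, because the cell size enters Ω_f and Ω_i identically and drops out of their ratio.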

This reasoning also explains why, as we noted on page 9, the number of states $\Omega(E)$ and
their density $g(E)$ are often treated as interchangeable in statistical mechanics. Although
entropy is defined as the logarithm of $\Omega(E)$, we usually only have knowledge of $g(E)$.
However, (3.3) ties these together by way of some unspecified energy width $\Delta E$, which
then acts as a factor in the mathematics without changing the physics, just like any factor
we might choose to put in front of Planck’s constant. So this $\Delta E$ is not usually written
explicitly. A slight complication is that unlike a factor in front of Planck’s constant,
$\Delta E$ has a dimension; but any choice of units will do because they’re all related by scaling
factors, which ultimately vanish because we only ever really consider $\Delta S$. In fact, any
absence of $\Delta E$ might be considered offset by the fact that we wrote $N$ for $N - 1$ in (3.11).
However, (3.11) used Stirling’s rule, which itself is just an approximation! You can see
that there is some vagueness in the number of factors of energy when we are dealing with
large numbers of particles.


5.3  Heat,  Entropy,  and  the  First  Law  Again 


When examining the energy of a system as a function of some variables, we have axes $V$
(volume, corresponding to mechanical interactions) and $N$ (particle number, corresponding
to diffusive interactions). But a third axis is needed to account for thermal interactions:
the đ$Q$ term in the First Law. That’s why we considered a thermal-only interaction at
the start of Section 5. Heating the system corresponds to increasing the entropy $S$, which
means we can choose to make the third axis simply entropy since, unlike heat, entropy is
a state variable like $V$ and $N$. In that case

$$dE = \left(\frac{\partial E}{\partial S}\right)_{V,N} dS - P\,dV + \mu\,dN. \tag{5.20}$$

For a quasistatic process we have $T = (\partial E/\partial S)_{V,N}$, producing

$$dE = T\,dS - P\,dV + \mu\,dN. \tag{5.21}$$


Comparing this with the provisional form of the First Law (4.7) allows us to write
đ$Q = T\,dS$, irrespective of whether $V, N$ are constant or not. Note that for the
interacting systems 1 and 2 at the start of Section 5 on their way to equilibrium, no heat went
in from the outside. But their combined entropy went up! So on the way to equilibrium,
đ$Q = 0$ but $dS > 0$. We can’t write đ$Q = T\,dS$ for the combined system; what would $T$ be
here, since it’s only defined at equilibrium? But certainly at equilibrium $T$ is defined, and
then đ$Q = T\,dS = 0$. In general, đ$Q = T\,dS$ is only written for quasistatic processes, since
these are always arbitrarily close to equilibrium. Since we only consider such processes,
(5.21) can be considered as the final form of the First Law of Thermodynamics, and we
will use it extensively.


Directions  of  Flow  from  the  First  Law 

Consider  systems  1  and  2  interacting  thermally,  mechanically,  and  diffusively: 

$$dS = dS_1 + dS_2. \tag{5.22}$$

Suppose energy, volume, and particle number are conserved. Express $dS_2$ in terms of $dS_1$
using the First Law:

$$\begin{aligned}
dE_2 &= -dE_1, \quad\text{so}\quad T_2\,dS_2 - P_2\,dV_2 + \mu_2\,dN_2 = -T_1\,dS_1 + P_1\,dV_1 - \mu_1\,dN_1, \\
dV_2 &= -dV_1, \\
dN_2 &= -dN_1, \quad\text{so}\quad T_2\,dS_2 + P_2\,dV_1 - \mu_2\,dN_1 = -T_1\,dS_1 + P_1\,dV_1 - \mu_1\,dN_1.
\end{aligned} \tag{5.23}$$

Thus

$$dS_2 = -\frac{T_1}{T_2}\,dS_1 + \frac{P_1 - P_2}{T_2}\,dV_1 + \frac{\mu_2 - \mu_1}{T_2}\,dN_1, \tag{5.24}$$

so that (5.22) becomes

$$dS = \left(1 - \frac{T_1}{T_2}\right)dS_1 + \frac{P_1 - P_2}{T_2}\,dV_1 + \frac{\mu_2 - \mu_1}{T_2}\,dN_1. \tag{5.25}$$

As  the  system  heads  toward  equilibrium,  d5  >  0.  We  have  the  freedom  to  control  how 
much  of  each  interaction  in  (5.25)  occurs.  So  we  require  the  entropy  to  increase  for  each 
interaction  if  that  interaction  occurs  on  its  own.  Hence  consider  each  term  on  the  right 
hand  side  of  (5.25)  separately  to  conclude: 


Thermal: 


d5i  >  °>  so  < 


Tx  <  P2  and  dp  >  0  (dtp  >  0) 


or 


(5.26) 


[  T)  >  P2  and  dp  <  0  (dtp  <  0) 
So  heat  flows  toward  the  region  of  lower  temperature. 


Mechanical: 


P  -P, 


f  P\>  P2  and  dp  >  0 


dp  >  0  ,  so  < 


or 


(5.27) 


^  P1  <  P-2  and  dp  <  0 

So  the  boundary  moves  toward  the  region  of  lower  pressure. 


Diffusive: 


M2  -  Mi 


f  >  Mi  and  dip  >  0 


dip  >  0  ,  so  < 


or 


(5.28) 


[  <  Mi  and  d ip  <  0 

So  particles  flow  toward  the  region  of  lower  chemical  potential. 

At equilibrium, as before, each term = 0 individually. Hence the temperatures, pressures,
and chemical potentials become equal.


A Note on Heat Flow.  Equation (5.24) ($\times\,T_2$) says

$$dQ_2 = -dQ_1 + \text{terms involving } dV_1 ,\ dN_1 .$$  (5.29)

That  is,  the  heat  flowing  into  system  2  will  only  equal  the  heat  flowing  out  of  1  if  the 
interaction  is  purely  thermal. 
Intensive and Extensive Variables

$T$, $P$, $\mu$ are intensive: they are only defined at equilibrium, and they don’t scale with the
system; they are constant throughout the system.

S,  V,  N  are  extensive:  they  are  defined  even  outside  of  equilibrium,  and  they  scale  linearly 
with  the  system. 

In the First Law these variables occur in conjugate pairs, with each term being an
intensive variable times an infinitesimal increase in its conjugate, extensive, partner.
Experimentally, these two types of variable seem to be sufficient to quantify all systems in
statistical mechanics.


5.4  “Deriving”  the  Ideal  Gas  Law 


Consider  rearranging  the  First  Law  to  give 


$$dS = \frac{dE}{T} + \frac{P\,dV}{T} - \frac{\mu\,dN}{T} .$$  (5.30)

Thus, in particular,

$$\frac{P}{T} = \left(\frac{\partial S}{\partial V}\right)_{E,N} .$$  (5.31)

But we know from (3.18) that for an ideal gas, $\Omega = f(E,N)\,V^N$ for some function $f$. Its
entropy is thus $S = k \ln f(E,N) + Nk \ln V$. Substitute this into (5.31) to give

$$\frac{P}{T} = \frac{Nk}{V} ,$$  (5.32)

which is the ideal gas law:

$$PV = NkT .$$  (5.33)


Actually,  we  haven’t  really  derived  the  ideal  gas  law  here.  What  we  have  really  done  is 
shown  that  our  statistical  definition  of  temperature  is  consistent  with  the  thermodynamic 
definition  of  temperature. 

With the number of moles $n = N/N_A$ (where $N_A$ = Avogadro’s number) and the gas
constant $R = N_A k \simeq 8.314$ J K$^{-1}$ mol$^{-1}$, we obtain the molar form of the ideal gas law:
$PV = nRT$. It’s often convenient in chemical calculations to replace $k$ by $R/N_A$: the gas
constant $R$ is a conveniently simple number, and $N_A$ allows well-known molar quantities
to be used. Good examples of this are found in Section 9.


Example: What is the volume of 1 kg of O$_2$ gas at 1 atmosphere and 20°C (101,325 Pa
and 293.15 K)?

Since one mole of O$_2$ has a mass of about 32 g, we are dealing with $n = 1000/32$ moles
of what is essentially an ideal gas. Then the required volume is

$$V = \frac{nRT}{P} = \frac{1000 \times 8.314 \times 293.15}{32 \times 101{,}325}\ \text{m}^3 = 0.752\ \text{m}^3 = 752\ \ell .$$  Answer  (5.34)
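The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the function name `ideal_gas_volume` and its argument names are our own choices for illustration, and the numbers are those quoted in the example.

```python
R = 8.314  # gas constant, J K^-1 mol^-1


def ideal_gas_volume(mass_g, molar_mass_g, pressure_pa, temperature_k):
    """Volume in m^3 from PV = nRT, with n = mass / molar mass."""
    n = mass_g / molar_mass_g
    return n * R * temperature_k / pressure_pa


# 1 kg of O2 (about 32 g/mol) at 1 atmosphere and 20 degrees C.
volume_m3 = ideal_gas_volume(1000.0, 32.0, 101_325.0, 293.15)
print(f"{volume_m3:.3f} m^3 = {volume_m3 * 1000:.0f} litres")
```

The same function handles any other ideal gas once its molar mass is supplied.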
6 Heat Capacity


Define the heat capacity of a substance, when heated while holding parameter $A$ constant,
as the heat added divided by the temperature increase it produces: $C_A = dQ/dT$.
Normally particle number $N$ is constant, so

$$C_A\,dT = dQ = dE + P\,dV .$$  (6.1)

At constant volume, $C_V\,dT = dE$, or

$$C_V = \left(\frac{\partial E}{\partial T}\right)_{V,N} .$$  (6.2)

At constant pressure, for an ideal gas we have

$$C_P\,dT = dE + P\,dV = C_V\,dT + P\,dV = C_V\,dT + nR\,dT .$$  (6.3)

That means

$$C_P = C_V + nR = C_V + Nk .$$  (6.4)

Now define

molar heat capacity $C^{\rm mol} = C/n$,
specific heat capacity $C^{\rm sp} = C/m$ ($m$ = total mass of the substance).  (6.5)

Equation (6.4) leads to a useful expression in chemistry:

$$C_P^{\rm mol} = C_V^{\rm mol} + R .$$  (6.6)

Given that specific heat capacity is usually fairly constant over everyday temperature
ranges of interest, the expression $dQ = C_A\,dT = m C^{\rm sp}\,dT$ can be integrated to give the
total heat energy $Q$ that must be absorbed by a mass $m$ to increase its temperature by $\Delta T$:

$$Q = m C^{\rm sp}\,\Delta T .$$  (6.7)

The  specific  heat  capacity  is  usually  just  called  specific  heat. 


6.1  The  Adiabatic  Process 

This  is  a  process  for  which  dQ  =  0:  no  heat  is  exchanged  between  the  system  and  its 
environment.  Particle  number  N  is  usually  taken  as  constant,  so  write  d E  +  PdV  =  0. 
Then,  for  an  ideal  gas, 


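$$C_V\,dT + P\,dV = 0 , \quad\text{or}\quad C_V\,\frac{d(PV)}{nR} + P\,dV = 0 .$$

That means $C_V V\,dP + C_V P\,dV + nR\,P\,dV = 0$,
or $C_V V\,dP + (C_V + nR)\,P\,dV = 0$,
so $C_V V\,dP + C_P P\,dV = 0$.  (6.8)

Divide by $PVC_V$:

$$\frac{dP}{P} = -\frac{C_P}{C_V}\,\frac{dV}{V} .$$

Define a (temperature-dependent) parameter $\gamma$:

$$\gamma = \frac{C_P}{C_V} = \frac{C_P^{\rm mol}}{C_V^{\rm mol}} = \frac{C_P^{\rm sp}}{C_V^{\rm sp}} .$$  (6.9)

In that case

$$\ln P = -\gamma \ln V + \text{const} , \quad\text{or}\quad P \propto V^{-\gamma} , \quad\text{so}\quad PV^{\gamma} = \text{constant} .$$  (6.10)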


This last expression is used when describing adiabatic processes for ideal gases on a PV diagram.
The adiabatic process, together with the isothermal process (PV = constant), make
up the Carnot cycle that forms the core description of the thermodynamics of engines and
refrigerators.
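As a rough numerical illustration of the adiabatic relation, the sketch below compresses an ideal gas to half its volume. The function name and the input values (1 atmosphere, unit volume, and a ratio of heat capacities of 1.4) are assumptions chosen only for illustration, and the ratio is treated as constant over the process.

```python
def adiabatic_final_state(p1, v1, v2, gamma):
    """Return (P2, T2/T1) for a quasistatic adiabatic change of an ideal gas."""
    p2 = p1 * (v1 / v2) ** gamma      # from P V^gamma = constant
    t_ratio = (p2 * v2) / (p1 * v1)   # from PV = NkT with N fixed
    return p2, t_ratio


# Illustrative (assumed) numbers: halve the volume with gamma = 1.4.
p2, t_ratio = adiabatic_final_state(p1=101_325.0, v1=1.0, v2=0.5, gamma=1.4)
print(f"P2 = {p2:.0f} Pa, T2/T1 = {t_ratio:.3f}")
```

The temperature rises even though no heat enters, because the work done on the gas goes entirely into its internal energy.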

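For an ideal gas, $E = \nu NkT/2$, so $C_V = \nu Nk/2$. Then

$$\gamma = \frac{C_V + Nk}{C_V} = \frac{\nu Nk/2 + Nk}{\nu Nk/2} = \frac{\nu + 2}{\nu} .$$  (6.11)

So measuring $C_P$ and $C_V$ gives information on the structure of the gas molecules:

$$\nu = \frac{2}{\gamma - 1} .$$  (6.12)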
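To see how measured heat capacities constrain the molecular structure, the following sketch solves the relation between the heat-capacity ratio and the number of quadratic energy terms. The molar heat capacities used are approximate room-temperature values for a diatomic gas, assumed here purely for illustration; the function name is our own.

```python
def degrees_of_freedom(c_p, c_v):
    """Number of quadratic energy terms per molecule from gamma = (nu + 2) / nu."""
    gamma = c_p / c_v
    return 2.0 / (gamma - 1.0)


# Assumed illustrative molar heat capacities in J K^-1 mol^-1 (diatomic gas).
nu_diatomic = degrees_of_freedom(c_p=29.1, c_v=20.8)
print(f"nu = {nu_diatomic:.2f}")
```

A result close to 5 is what we expect for three translational plus two rotational terms.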


Example: The entropy of water at 25°C and 1 atmosphere is 188.8 J K$^{-1}$ mol$^{-1}$.
Water’s molecular mass is 0.018 kg/mol and its specific heat is 4186 J K$^{-1}$ kg$^{-1}$. Raise
its temperature to 27°C. What is the new molar entropy? (This example comes from
page 183 #40 of [1].)

Deal  with  one  mole.  We  require  S,  its  entropy  at  27°C.  When  heating  anything, 
it’s  wise  not  to  bolt  the  lid  down,  so  we  assume  the  above  specific  heat  is  for  constant 
pressure,  although  what  is  being  held  constant  actually  doesn’t  matter  for  the  purpose 
of  the  calculation  (i.e.,  the  number  4186  has  the  process  built  into  it,  and  we  don’t  need 
to  know  how  the  temperature  was  increased).  We  can  only  assume  the  specific  heat  is 
constant  over  temperature.  The  entropy  increase  from  25°C  is 


$$\Delta S = \int dS = \int \frac{dQ}{T} = \int_{25^\circ{\rm C}}^{27^\circ{\rm C}} \frac{C\,dT}{T} = C \ln\frac{273.15 + 27}{273.15 + 25} ,$$  (6.13)

where $C$ is the heat capacity of 1 mole; this is the heat capacity of 0.018 kg, so that
$C = 4186$ J K$^{-1}$ kg$^{-1}$ $\times$ 0.018 kg. Then

$$S = 188.8\ \text{J/K} + \Delta S = 188.8 + 4186 \times 0.018\,\ln\frac{300.15}{298.15}\ \text{J/K} = 189.3\ \text{J/K} .$$  (6.14)

So the new molar entropy is 189.3 J K$^{-1}$ mol$^{-1}$.  Answer
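The integral in this example is simple enough to evaluate directly. The sketch below repeats the calculation, assuming (as the example does) that the heat capacity is constant over the temperature range; the function name is ours.

```python
import math


def entropy_after_heating(s_initial, heat_capacity, t_initial, t_final):
    """S_final = S_initial + C ln(T_final / T_initial) for constant heat capacity C."""
    return s_initial + heat_capacity * math.log(t_final / t_initial)


c_mole = 4186.0 * 0.018  # heat capacity of one mole of water, J/K
s_new = entropy_after_heating(188.8, c_mole, 273.15 + 25.0, 273.15 + 27.0)
print(round(s_new, 1))  # 189.3
```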


Third  Law  of  Thermodynamics 

Experimentally, there seems never to be more than one state available ultimately as a
system’s temperature $\to 0$. This suggests the Third Law of Thermodynamics:


In  the  limit  of  temperature  going  to  zero,  a  system’s  entropy  goes  to  zero, 
regardless  of  its  makeup  or  the  makeup  of  its  environment. 


This allows us to write a system’s entropy at some temperature $T$ and constant parameter(s) $A$
explicitly as

$$S_A(T) = S_A(T) - S_A(0) = \int_{T=0}^{T} dS_A = \int_0^T \frac{C_A\,dT}{T} .$$  (6.15)


7  The  Flow  of  Heat  Energy 


The current density $\mathbf{J}$ (also known as flux density) of heat energy is defined as the vector
pointing in the direction of heat flow, whose length is the energy crossing a perpendicular
unit area in unit time in that direction. Experimentally it’s found to be proportional to
the spatial rate of loss of $T$, or $-\nabla T$. So

$$\mathbf{J} = -\kappa\,\nabla T ,$$  (7.1)

where $\kappa > 0$ is the thermal conductivity. The heat current across an area $A$ is

$$I = \int \mathbf{J}\cdot\mathbf{n}\,dA = -\kappa \int \nabla T\cdot\mathbf{n}\,dA ,$$  (7.2)

where the unit vector $\mathbf{n}$ is perpendicular to $dA$. But for any scalar function $T$, the increase
in $T$ along a small step $\mathbf{n}\,d\ell$ in space is $dT = \nabla T\cdot\mathbf{n}\,d\ell$. Then

$$\frac{dT}{d\ell}\ \text{in the } \mathbf{n} \text{ direction} = \nabla T\cdot\mathbf{n}$$  (7.3)

(called a directional derivative, and sometimes written $\partial T/\partial n$). Equation (7.2) becomes

$$I = -\kappa \int \frac{dT}{d\ell}\bigg|_{\text{along } \mathbf{n}} dA = -\kappa \left\langle \frac{dT}{d\ell}\bigg|_{\perp \text{ to surface}} \right\rangle A ,$$  (7.4)

where $\langle\cdot\rangle$ denotes the mean value over the area $A$. So

$$-\left\langle \frac{dT}{d\ell}\bigg|_{\perp \text{ to surface}} \right\rangle = \frac{I}{\kappa A} ,$$  (7.5)

which leads to

$$-\Delta T\,\big|_{\perp \text{ to surface}} = \frac{I\,\Delta\ell}{\kappa A} = I R .$$  (7.6)


Here $R$ is the thermal resistance, analogous to electrical resistance by way of Ohm’s Rule,
which says that an electric current $I$ experiences a drop in electric potential $\Phi$ across a
resistance $R$ of

$$-\Delta\Phi = IR .$$  (7.7)

(The drop $-\Delta\Phi$ is more usually written $V$.) When connecting thermal resistances in series
and parallel to model heat flow through complex objects, we add them in the same way
as we do electrical resistances. The quantity $RA = \Delta\ell/\kappa$ is called the R-factor in the
building trade.

Example: An 18 m $\times$ 6 m roof is made of 25 mm thick pine board with thermal conductivity
$\kappa = 0.11$ W m$^{-1}$ K$^{-1}$, covered with asphalt shingles of R-factor $R_f = 0.0776$ K m$^2$ W$^{-1}$.
Neglecting the overlap of the shingles, how much heat is conducted through the roof when
the inside temperature is 21°C and the outside temperature is 5°C?

When  speaking  of  conducted  heat,  we  mean  the  heat  current  I,  measured  in  watts. 
We  must  calculate,  using  (7.6), 


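$$I = \frac{-\Delta T}{R_{\rm pine} + R_{\rm asph}} = \frac{-\Delta T\,A}{R_f({\rm pine}) + R_f({\rm asph})} = \frac{-\Delta T\,A}{\Delta\ell/\kappa({\rm pine}) + R_f({\rm asph})} = \frac{16 \times 18 \times 6}{0.025/0.11 + 0.0776}\ \text{W} = 5.67\ \text{kW} .$$  Answer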
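The roof calculation amounts to adding R-factors in series, which the following sketch makes explicit. The function name `series_heat_current` is our own, and the layers are assumed to share the same area, as in the example.

```python
def series_heat_current(temperature_drop, area, r_factors):
    """Heat current in watts through layers in series, given each layer's R-factor."""
    return temperature_drop * area / sum(r_factors)


r_pine = 0.025 / 0.11  # thickness / conductivity, K m^2 W^-1
current_w = series_heat_current(21.0 - 5.0, 18.0 * 6.0, [r_pine, 0.0776])
print(f"{current_w / 1000:.2f} kW")  # 5.67 kW
```

Adding a further layer, such as insulation batts, only requires appending its R-factor to the list.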


7.1  The  Continuity  Equation 


This is a general equation describing local conservation of some quantity. Consider an
energy density $\varrho_E$ in a volume $V$. There is a current density $\mathbf{J}$ carrying energy across the
surface $A$ of the volume. In a time $dt$, the volume loses an amount of energy equal to
$-d\int \varrho_E\,dV$. This equals the amount that flows out through the closed surface, which is
$\oint dt\,\mathbf{J}\cdot\mathbf{n}\,dA$. So

$$-d\int \varrho_E\,dV = \oint dt\,\mathbf{J}\cdot\mathbf{n}\,dA = dt \int \nabla\cdot\mathbf{J}\,dV .$$  (7.8)

That means

$$\int \left(\frac{\partial\varrho_E}{\partial t} + \nabla\cdot\mathbf{J}\right) dV = 0 \quad\text{for all volumes } V .$$  (7.9)

Since the volume is quite arbitrary, we arrive at the continuity equation:

$$\frac{\partial\varrho_E}{\partial t} + \nabla\cdot\mathbf{J} = 0 .$$  (7.10)


This  idea  of  local  conservation  contrasts  with  global  conservation ,  in  which  a  quantity 
might  vanish  at  one  point  but  re-appear  at  another.  Although  the  quantity  might  well 
have  been  conserved,  there  may  have  been  no  flow  across  any  surface  in  between  the 
two  points  of  vanishing  and  emergence.  This  is  a  weak  form  of  the  idea  of  conservation; 
local  conservation  is  a  much  stronger  concept.  For  example,  energy  is  always  found  to  be 
conserved  locally. 
7.2 The Heat Equation (a.k.a. Diffusion Equation)

Since $\mathbf{J} = -\kappa\nabla T$, the continuity equation becomes

$$\frac{\partial\varrho_E}{\partial t} - \kappa\nabla^2 T = 0 .$$  (7.11)

But an increase in energy $dE$ in the volume equals $m C^{\rm sp}\,dT$ where the mass inside is $m$.
So divide $dE = m C^{\rm sp}\,dT$ by the volume to get $d\varrho_E = \varrho_m C^{\rm sp}\,dT$ where $\varrho_m$ is the mass
density. Thus

$$\frac{\partial\varrho_E}{\partial t} = \varrho_m C^{\rm sp}\,\frac{\partial T}{\partial t} .$$  (7.12)

Put this into (7.11) to get

$$\nabla^2 T = \frac{\varrho_m C^{\rm sp}}{\kappa}\,\frac{\partial T}{\partial t} .$$  (7.13)

This is the heat equation, or diffusion equation. It has been produced by combining the
continuity equation with the experimental observation that the current density is proportional
to the spatial loss in $T$.


7.3  Solving  the  Heat  Equation 

The  general  heat  equation  is 

$$\nabla^2 T = \frac{1}{K}\,\frac{\partial T}{\partial t} \qquad (K > 0) .$$  (7.14)

We can see why this might well model the flow of heat. The reason is because $\nabla^2 T$ is
a  second  spatial  derivative.  When  there  are  no  sources  and  T  is  peaked  (a  hot  spot), 
the  second  spatial  derivative  is  negative,  which  means  dT/dt  is  also  negative.  So  the 
temperature  in  a  hot  spot  decreases  with  time.  Similarly,  when  T  is  a  trough  (a  cold 
spot),  the  second  spatial  derivative  is  positive,  which  means  dT/dt  is  also  positive.  That 
means  the  temperature  in  a  cold  spot  increases  with  time.  This  behaviour  is  just  what  we 
expect  of  temperature. 

The heat equation is linear, from which it follows that any linear combination of solutions
is also a solution. There is a huge literature devoted to solving partial differential
equations that relate the laplacian operator $\nabla^2$ to zeroth, first, and second time derivatives.
The topic is normally found in applied maths courses, so here we’ll consider just one
approach useful for dealing with an infinite domain (i.e., no boundary conditions). One
solution to the heat equation is

$$T(t, \mathbf{x}) = \frac{1}{(4\pi K t)^{3/2}} \exp\frac{-|\mathbf{x} - \mathbf{x}'|^2}{4Kt} .$$  (Prove it!)  (7.15)

This is a (normalised) gaussian of width $\sigma = \sqrt{2Kt}$. When $t = 0$ this becomes a delta
function, $\delta(\mathbf{x} - \mathbf{x}')$. That is,

$$T(0, \mathbf{x}) = \delta(\mathbf{x} - \mathbf{x}') \quad\Longrightarrow\quad T(t, \mathbf{x}) = \frac{1}{(4\pi K t)^{3/2}} \exp\frac{-|\mathbf{x} - \mathbf{x}'|^2}{4Kt} .$$  (7.16)
Now, suppose the initial temperature distribution is known to be $T(0, \mathbf{x}) = f(\mathbf{x})$. We can
certainly write

$$f(\mathbf{x}) = \int_{-\infty}^{\infty} f(\mathbf{x}')\,\delta(\mathbf{x} - \mathbf{x}')\,dV' .$$  (7.17)

We have expressed the initial temperature distribution as a linear combination of delta
functions, and each of these delta functions evolves according to (7.16). That means the
general solution of the heat equation is

$$T(t, \mathbf{x}) = \int_{-\infty}^{\infty} f(\mathbf{x}')\,\underbrace{\frac{1}{(4\pi K t)^{3/2}} \exp\frac{-|\mathbf{x} - \mathbf{x}'|^2}{4Kt}}_{\text{Green function for heat equation}}\,dV' .$$  (7.18)


So provided we can do the integral, perhaps numerically, any initial temperature distribution
can be propagated forwards in time.

Equation (7.18) has the form

$$\int_{-\infty}^{\infty} a(x')\,b(x - x')\,dx' = a(x) * b(x) ,$$  (7.19)

which is called the convolution of $a(x)$ and $b(x)$. In this language, (7.18) is written

$$T(t, \mathbf{x}) = T(0, \mathbf{x}) * \frac{1}{(4\pi K t)^{3/2}} \exp\frac{-|\mathbf{x}|^2}{4Kt} .$$  (7.20)

At  each  moment  in  time,  the  convolution  is  essentially  a  moving  spatial  mean:  it  acts  to 
smear  out  the  initial  temperature  distribution,  with  the  (gaussian)  smear  getting  wider 
and  wider  as  time  goes  on,  just  as  we  expect  will  happen  as  heat  flows.  The  theory  of 
convolutions  is  intimately  related  to  Fourier  and  Laplace  transforms. 
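The remark that the integral can be done numerically can be made concrete with a small sketch. To keep it short we work in one dimension, where the gaussian prefactor carries the power 1/2 rather than 3/2; the grid, the initial "hot slab", and the values of $K$ and $t$ are all assumptions chosen for illustration, and the integral is approximated by a simple Riemann sum.

```python
import math


def heat_kernel_1d(x, t, K):
    """1-D Green function of dT/dt = K d2T/dx2 (prefactor power 1/2 in one dimension)."""
    return math.exp(-x * x / (4.0 * K * t)) / math.sqrt(4.0 * math.pi * K * t)


def propagate(f0, xs, t, K):
    """Riemann-sum approximation to T(t, x) = integral of f(x') G(x - x') dx'."""
    dx = xs[1] - xs[0]
    return [sum(f * heat_kernel_1d(x - xp, t, K) for f, xp in zip(f0, xs)) * dx
            for x in xs]


xs = [-10.0 + 0.1 * i for i in range(201)]        # uniform grid on [-10, 10]
f0 = [1.0 if abs(x) < 1.0 else 0.0 for x in xs]   # assumed initial hot slab
T1 = propagate(f0, xs, t=0.5, K=1.0)

print(f"peak before: {max(f0):.3f}, peak after: {max(T1):.3f}")
print(f"total before: {sum(f0):.6f}, total after: {sum(T1):.6f}")
```

The peak temperature drops while the summed temperature is unchanged, which is the smearing-with-conservation behaviour described above.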


8  Integrating  the  Internal  Energy 

Consider 

$$E = \int dE = \int (T\,dS - P\,dV + \mu\,dN) .$$  (8.1)

Because $T$, $P$, $\mu$ are intensive and $S$, $V$, $N$ are extensive, we can calculate $E$ by breaking the
final complete system up into many small pieces, each with $T$, $P$, $\mu$ at their final values,
and each piece having $dS$, $dV$, $dN$. Then simply add these to get

$$E = TS - PV + \mu N .$$  (8.2)

It  might  seem  surprising  that  we  have  been  able  to  integrate  the  energy  so  easily,  but 
the  method  works  precisely  because  of  the  distinction  between  intensive  and  extensive 
variables  in  the  First  Law. 

Calculating $dE$ once more from (8.2) and then applying the First Law produces the
Gibbs-Duhem equation

$$S\,dT - V\,dP + N\,d\mu = 0 ,$$  (8.3)

which  relates  changes  in  the  intensive  variables.  It  also  shows  that  if  one  of  these  intensive 
variables  changes,  then  at  least  one  other  intensive  variable  must  change. 
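The step from (8.2) to (8.3) can be spelled out in three lines. This is a sketch of the standard argument, using only the product rule and the First Law in the form that appears inside the integral of (8.1):

```latex
\begin{align*}
dE &= T\,dS + S\,dT - P\,dV - V\,dP + \mu\,dN + N\,d\mu
   && \text{(differentiate (8.2))} \\
dE &= T\,dS - P\,dV + \mu\,dN
   && \text{(First Law)} \\
\Longrightarrow\quad 0 &= S\,dT - V\,dP + N\,d\mu
   && \text{(subtract the second line from the first)}
\end{align*}
```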
8.1 Switching Dependence on Variables

How to switch from a dependence on e.g. $S$ to its conjugate variable $T$? Define the
Helmholtz Free Energy $F = E - TS$. (It is called free energy because in a heat engine it
is the energy available, or “free”, to do work.)

$$dF = dE - S\,dT - T\,dS = -S\,dT - P\,dV + \mu\,dN .$$  (8.4)

Compare this to the First Law: we have swapped the $S$ and $T$ by way of defining a new
variable $F$. (This technique of defining a new variable by adding or subtracting the product
of the relevant conjugate pair is an example of a Legendre transform, which appears in
other areas of physics.) For an isothermal nondiffusive process, $dT = dN = 0$, so

$$dF = -P\,dV = \text{work done on system.}$$  (8.5)

Similarly, to interchange $P$ and $V$, define the enthalpy $H = E + PV$.

$$dH = T\,dS + V\,dP + \mu\,dN .$$  (8.6)

Isobaric nondiffusive processes are of great relevance to chemical reactions, which are
often performed in an open vessel and so are isobaric. For these,

$$dH = T\,dS = \text{heat entering system.}$$  (8.7)

Enthalpy tells us whether such reactions make their surroundings hotter or colder. The
reaction $a \to b$ has a total heat energy entering the system of $\Delta H = H_b - H_a$. When
$\Delta H < 0$ the reaction is exothermic: the reaction vessel gets hotter. When $\Delta H > 0$ the
reaction is endothermic: the reaction vessel gets colder.

Define the Gibbs Free Energy $G = E - TS + PV = \mu N$.

$$dG = -S\,dT + V\,dP + \mu\,dN .$$  (8.8)

For an isothermal isobaric process, $dG = \mu\,dN$ (the $N\,d\mu$ is absent on account of the
Gibbs-Duhem equation). So the Gibbs free energy is useful in analysing diffusive processes.
The Grand Free Energy $\Phi = E - TS - \mu N = -PV$ can also be defined, with

$$d\Phi = -S\,dT - P\,dV - N\,d\mu .$$  (8.9)

This is useful for describing systems with constant $T$, $V$, $\mu$ (isothermal, isochoric, nondiffusive).

8.2  Maxwell  Relations 

These  are  simply  expressions  of  mixed  partial  derivatives  from  the  First  Law.  Consider  a 
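function $f(x, y, z)$. We have

$$df = X\,dx + Y\,dy + Z\,dz .$$  (8.10)

But we know that e.g.

$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} , \quad\text{i.e.}\quad \frac{\partial X}{\partial y} = \frac{\partial Y}{\partial x} ,$$  (8.11)

or, more explicitly,

$$\left(\frac{\partial X}{\partial y}\right)_{x,z} = \left(\frac{\partial Y}{\partial x}\right)_{y,z} .$$  (8.12)

Apply this idea to the First Law: $dE = T\,dS - P\,dV + \mu\,dN$. Then e.g.

$$\left(\frac{\partial T}{\partial V}\right)_{S,N} = -\left(\frac{\partial P}{\partial S}\right)_{V,N} ,$$  (8.13)

and so on. This is an example of a Maxwell relation. Others follow from $F$, $G$, $H$. For
example, using $dF = -S\,dT - P\,dV + \mu\,dN$, we get

$$-\left(\frac{\partial P}{\partial N}\right)_{T,V} = \left(\frac{\partial\mu}{\partial V}\right)_{T,N} ,$$  (8.14)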


and  so  on.  These  relations  are  useful  in  allowing  us  to  change  variables  depending  on 
the  experimental  setup.  It’s  usually  best  to  use  as  independent  variables  ones  that  are 
either  constrained  or  easily  measured.  Because  three  variables  are  needed  to  describe  a 
system,  we  can  choose  one  of  each  type  (thermal,  mechanical,  diffusive).  E.g.,  what  is  the 
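dependence of $E$ on $T$, $P$ for a closed system? (We can ignore $\mu$, $N$ here as there is no
diffusion.) We have

$$dE = \left(\frac{\partial E}{\partial T}\right)_P dT + \left(\frac{\partial E}{\partial P}\right)_T dP .$$  (8.15)

We can consider $E$ as a function of $S$, $V$, each a function of $T$, $P$, to write

$$\left(\frac{\partial E}{\partial T}\right)_P = \left(\frac{\partial E}{\partial S}\right)_V \left(\frac{\partial S}{\partial T}\right)_P + \left(\frac{\partial E}{\partial V}\right)_S \left(\frac{\partial V}{\partial T}\right)_P .$$  (8.16)

But $dE = T\,dS - P\,dV$, so $(\partial E/\partial S)_V = T$ and $(\partial E/\partial V)_S = -P$. Thus (8.16) becomes

$$\left(\frac{\partial E}{\partial T}\right)_P = T\left(\frac{\partial S}{\partial T}\right)_P - P\left(\frac{\partial V}{\partial T}\right)_P .$$  (8.17)

Similarly, $(\partial E/\partial P)_T$ can be found.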

An  example  of  dependencies  that  are  easily  measured  starts  with 


■"  (8) 


(8.18) 


V,N 


This  leads  to  (assuming  here  that  N  is  constant,  so  we’ll  drop  reference  to  N): 


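$$C_V = \left(\frac{dQ}{dT}\right)_V = \left(\frac{T\,dS}{dT}\right)_V = T\left(\frac{\partial S}{\partial T}\right)_V ,$$
$$C_P = \left(\frac{dQ}{dT}\right)_P = \left(\frac{T\,dS}{dT}\right)_P = T\left(\frac{\partial S}{\partial T}\right)_P .$$  (8.19)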


Also  define 


$$\text{coeff. of thermal expansion } \beta = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_P = \text{relative increase in } V \text{ with } T \text{ at constant } P ,$$  (8.20)

and

$$\text{coeff. of isothermal compressibility } \kappa = \frac{-1}{V}\left(\frac{\partial V}{\partial P}\right)_T = \text{relative decrease in } V \text{ with } P \text{ at constant } T .$$  (8.21)


Another  useful  tool  arises  from  answering  a  question  such  as:  if  z  =  z(x,  y )  and  z  is  held 
constant,  how  do  x  and  y  relate?  Write 


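$$0 = dz = \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy .$$  (8.22)

Hence

$$\left(\frac{\partial z}{\partial x}\right)_y dx = -\left(\frac{\partial z}{\partial y}\right)_x dy \quad (\text{constant } z) ,$$  (8.23)

in which case

$$\left(\frac{\partial x}{\partial y}\right)_z = \frac{-(\partial z/\partial y)_x}{(\partial z/\partial x)_y} = -\left(\frac{\partial z}{\partial y}\right)_x \left(\frac{\partial x}{\partial z}\right)_y .$$  (8.24)

For example, at constant $N$,

$$\left(\frac{\partial P}{\partial T}\right)_V = \frac{-(\partial V/\partial T)_P}{(\partial V/\partial P)_T} = \frac{\beta}{\kappa} .$$  (8.25)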


Although  a  study  of  the  Maxwell  Relations  forms  a  useful  exercise  in  dealing  with  the 
various  partial  derivatives  that  arise  in  statistical  mechanics,  they  do  not  form  a  major 
part  of  the  modern  subject,  and  we  will  not  pursue  them  further. 


9  The  Chemical  Potential  and  Phase  Changes 


Consider  interactions  of  various  substances.  For  there  to  be  much  interaction  at  all,  the 
particles  should  be  able  to  move  about;  so  we  will  model  the  number  of  accessible  states  of 
a  complex  material  to  have  a  volume  term  just  like  a  gas  of  identical  particles,  combining 
the  ideas  of  (3.18)  and  (3.19): 


$$\Omega \propto \left(\frac{V}{N}\right)^{N} \left(\frac{E - N u_0}{N}\right)^{\nu N/2} .$$  (9.1)

Set $\varrho = N/V$ = particle density. Also, Equipartition says $E - N u_0 = \nu NkT/2$. So

$$\Omega \propto \varrho^{-N}\,T^{\nu N/2} .$$  (9.2)

The implied normalisation here will allow $\Omega$ to be dimensionless, and can be embodied
in a sort of “reference density” $\varrho^*$ and “reference temperature” $T^*$. In that case the
entropy $S = k \ln\Omega$ would be written

$$S = \text{const.} - Nk \ln(\varrho/\varrho^*) + \tfrac12\,\nu Nk \ln(T/T^*) ,$$  (9.3)

since we can only take the logarithm of a dimensionless quantity. But in practice, $\varrho^*$
and $T^*$ will end up cancelling out anyway, so we will simply absorb them into $\varrho$ and $T$,
and write the entropy as

$$S = \text{const.} - Nk \ln\varrho + \tfrac12\,\nu Nk \ln T .$$  (9.4)
Use this to calculate $\mu$:

$$\mu N = E + PV - TS = N u_0 + \tfrac12\,\nu NkT + \underbrace{\begin{cases} NkT & \text{(gas)} \\ \text{negligible} & \text{(liq./sol.)} \end{cases}}_{PV} \ \underbrace{{} - \text{const.} \times T + NkT \ln\varrho - \tfrac12\,\nu NkT \ln T}_{-TS} ,$$  (9.5)

or

$$\mu = u_0 + kT \ln\varrho + g(T) \quad\text{for some function } g .$$  (9.6)

That is, seeking a lower $\mu$ goes with seeking regions of lower “ambient” energies $u_0$ and
seeking regions of lower particle densities.


9.1  Colligative  Properties  of  Solutions 

These are properties that only depend on the concentration of a dissolved substance (the
solute), but not on what it actually is. We will use the example of adding a small amount
of common salt (the solute) to pure water (the solvent), which approximates sea water.
Sea water typically has 2% salt by particle number, meaning there are approximately
2 salt ions (we’re not concerned with whether they are Na$^+$ or Cl$^-$) to every 98 water ions
(which could be H$_3$O$^+$ or OH$^-$). For this case, set a variable $f = 0.02$.

Initially (before adding salt), the water has density $\varrho_0 = N/V$.

Finally (salt + water), the water’s density is $\varrho = (1 - f)N/V = (1 - f)\varrho_0$.
Focus on the water’s chemical potential: for small $f$ it increases by

$$\Delta\mu = \mu_f - \mu_i \simeq u_0 + kT \ln\varrho + g(T) - u_0 - kT \ln\varrho_0 - g(T) = kT \ln\frac{\varrho}{\varrho_0} = kT \ln(1 - f) \simeq -fkT .$$  (9.7)

[This is a good example of how $\varrho^*$ cancels: we might choose to make the replacements $\varrho \to \varrho/\varrho^*$,
$\varrho_0 \to \varrho_0/\varrho^*$, but that won’t change the last line of (9.7).]

Start with (1) water in equilibrium with its vapor (both with chemical potential $\mu$),
then (2) salt is added, which reduces the chemical potential of the water to $\mu - fkT$ and
so destroys the equilibrium, then (3) liquid/vapor equilibrium is again restored, with a
new chemical potential $\mu'$. There have been overall increases in temperature and pressure
of $\Delta T$, $\Delta P$. Now make use of the Gibbs-Duhem equation (8.3), writing that equation as

$$\Delta\mu \simeq -\frac{S}{N}\,\Delta T + \frac{V}{N}\,\Delta P .$$  (9.8)

Also set the entropy per particle to be $s = S/N$, and the volume per particle to be
$v = V/N$. Now we note the following:

For the vapor, in going from (1) to (3) in the previous paragraph,

$$\Delta\mu = \mu' - \mu \simeq -s_{\rm vap}\,\Delta T + v_{\rm vap}\,\Delta P .$$  (9.9)
For the water, in going from (1) to (2), there are approximately no changes in temperature
and pressure (this is a simplification), and in going from (2) to (3),

$$\Delta\mu = \mu' - (\mu - fkT) \simeq -s_{\rm liq}\,\Delta T + v_{\rm liq}\,\Delta P .$$  (9.10)

Combine (9.9), (9.10) to give

$$-s_{\rm vap}\,\Delta T + v_{\rm vap}\,\Delta P + fkT \simeq -s_{\rm liq}\,\Delta T + v_{\rm liq}\,\Delta P ,$$  (9.11)

or

$$-(s_{\rm vap} - s_{\rm liq})\,\Delta T + (v_{\rm vap} - v_{\rm liq})\,\Delta P \simeq -fkT .$$  (9.12)

Examine this last equation for the two cases of constant temperature and constant pressure:

Constant Temperature

$$(v_{\rm vap} - v_{\rm liq})\,\Delta P \simeq -fkT .$$  (9.13)

But $v_{\rm vap} \gg v_{\rm liq}$, so write, for the vapor, $v\,\Delta P \simeq -fkT$, and treating the vapor as an ideal
gas (so $Pv = kT$), this becomes

$$\frac{-\Delta P}{P} \simeq f .$$  (9.14)

That is, the relative drop in vapor pressure $\simeq f$, the fraction of solute particles in solution.

Constant  Pressure 


$$(s_{\rm vap} - s_{\rm liq})\,\Delta T \simeq fkT .$$  (9.15)

Follow  n  particles  leaving  the  liquid:  they  have  picked  up  the  latent  heat  of  vaporisation, 
which  increases  the  distance  between  particles  without  increasing  their  kinetic  energy  (and 
therefore  without  increasing  their  temperature).  They  each  have  mass  m,  and  the  specific 
latent  heat  of  vaporisation  (i.e.  per  unit  mass)  is  Lsp.  So 


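$$s_{\rm vap} - s_{\rm liq} = (s \text{ for the } n \text{ particles after entering the vapor}) - (s \text{ for the } n \text{ particles before leaving the liquid})$$
$$= \frac{\Delta S}{n} = \frac{Q}{Tn} \quad (\text{where } Q = \text{latent heat in})$$
$$= \frac{L^{\rm sp}\,nm}{Tn} = \frac{L^{\rm sp}\,m}{T} .$$  (9.16)

Thus (9.15) becomes

$$\frac{L^{\rm sp}\,m}{T}\,\Delta T \simeq fkT , \quad\text{so}\quad \Delta T \simeq \frac{fkT^2}{L^{\rm sp}\,m} .$$  (9.17)

If the latent heat is specified as a molar latent heat $L^{\rm mol}$ (latent heat per mole), then the
last equation becomes

$$\Delta T \simeq \frac{fRT^2}{L^{\rm sp}\,m N_A} = \frac{fRT^2}{\text{lat. heat for 1 mol}} = \frac{fRT^2}{L^{\rm mol}} .$$  (9.18)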
Raising Boiling Points  We add salt to water. That lowers the chemical potential of
the water by $fkT$, so that the vapor has a higher chemical potential. Thus vapor particles
start  entering  the  brine  (because  diffusion  always  occurs  from  high  chemical  potential 
to  low  chemical  potential).  Consider  the  water  initially  boiling,  in  equilibrium  with  its 
vapor.  Add  salt,  and  vapor  starts  to  enter  the  brine.  If  we  wish  to  restore  the  equilibrium 
at  constant  pressure,  we  must  increase  the  temperature  by  AT.  That  is,  the  boiling  point 
of  the  brine  will  now  be  100°C  +  AT. 

Example: Calculate the temperature at which the above 2% salt water mixture boils.
Its molar latent heat of vaporisation is $L^{\rm mol}$ = 40,700 J/mol.

Use $f = 0.02$ and write

$$\Delta T = \frac{fRT^2}{L^{\rm mol}} = \frac{0.02 \times 8.314 \times 373^2}{40{,}700}\ \text{K} \simeq 0.6\ \text{K} .$$  (9.19)

So the new boiling point is 100.6°C.  Answer


Lowering  Melting  Points  Replace  the  water  vapor  in  the  above  discussions  with  ice. 
Now,  instead  of  vapor  particles  entering  the  brine,  ice  particles  enter  the  brine.  That  is, 
when  salt  is  added  to  a  water/ice  mixture,  the  ice  starts  to  melt.  To  restore  equilibrium  at 
constant pressure, we must now remove heat. This is equivalent to making $L^{\rm mol}$ negative
in the above equations. If the molar latent heat of fusion of water is 6000 J/mol, the new
melting point of the ice is 0°C + $\Delta T$, where

$$\Delta T = \frac{fRT^2}{L^{\rm mol}} = \frac{0.02 \times 8.314 \times 273^2}{-6000}\ \text{K} \simeq -2\ \text{K} .$$  (9.20)

So the new freezing point of the brine is $-2$°C. Because adding salt to an ice/water mixture
causes  the  ice  to  melt,  this  is  one  method  of  de-icing  roads  in  winter — provided  the  ambient 
temperature  is  not  too  low. 
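Both colligative shifts use the same formula with different latent heats, so a single helper suffices to check the numbers. The function name `transition_shift` is ours, and the sign convention (negative latent heat for freezing) follows the discussion above.

```python
R = 8.314  # gas constant, J K^-1 mol^-1


def transition_shift(f, temperature_k, molar_latent_heat):
    """Shift in transition temperature: f R T^2 / L_mol (L_mol < 0 for freezing)."""
    return f * R * temperature_k ** 2 / molar_latent_heat


boil_shift = transition_shift(0.02, 373.0, 40_700.0)
freeze_shift = transition_shift(0.02, 273.0, -6000.0)
print(f"boiling point shift:  {boil_shift:+.2f} K")
print(f"freezing point shift: {freeze_shift:+.2f} K")
```

The freezing-point shift is several times larger than the boiling-point shift, mainly because the latent heat of fusion is so much smaller than that of vaporisation.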


9.2  Osmotic  Pressure 

Suppose some water ($\mu$) is separated from brine ($\mu - fkT$) by a membrane through which
water can pass, but not salt. Imagine this was originally 2 systems, both at $\mu$. Now we
lower the brine’s potential to $\mu - fkT$. Gibbs-Duhem says that this change goes together
with a $\Delta P$ and a $\Delta T$:

$$\Delta\mu = -fkT = -s_{\rm brine}\,\Delta T + v_{\rm brine}\,\Delta P .$$  (9.21)

At constant temperature, there will now be an osmotic pressure “forcing” water molecules
to diffuse from $\mu$ to $\mu - fkT$ (high to low potentials):

$$\Delta P = \left(\frac{N}{V}\right)_{\rm brine} \Delta\mu .$$  (9.22)

We know the pressure acts to force water into the brine, so consider absolute values,
writing (with $V^{\rm mol}$ the volume of one mole, and there are $N$ particles of water in the brine)

$$\text{osmotic pressure} = \left(\frac{N}{V}\right)_{\rm brine} fkT = \frac{N fkT}{(N/N_A)\,V^{\rm mol}} = \frac{fRT}{V^{\rm mol}} .$$  (9.23)
V  *  J  brine  J\[A  ^mol  *raol 
Example: What is the osmotic pressure for pure water diffusing into brine at 25°C?

$$\text{pressure} = \frac{0.02 \times 8.314 \times 298}{18 \times 10^{-6}}\ \text{Pa} = 2.75\ \text{MPa} \simeq 27\ \text{atm} .$$  Answer  (9.24)

This is very large! The salinity of humans is between pure water and sea water:

$$\mu_{\text{sea water}} < \mu_{\rm human} < \mu_{\text{fresh water}} ,$$

with osmotic pressure driving water from higher to lower $\mu$.

So  if  we  drink  fresh  water,  it  diffuses  into  our  organs,  which  is  good.  But  if  we  drink  sea 
water,  pure  water  diffuses  out  of  our  organs  into  the  sea  water,  and  we  dehydrate. 
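The osmotic pressure in the example follows from one line of arithmetic, sketched below. We assume a molar volume for liquid water of 18 cm$^3$ (that is, 18 g/mol at about 1 g/cm$^3$), which is the value that reproduces the quoted answer; the function name is ours.

```python
R = 8.314  # gas constant, J K^-1 mol^-1


def osmotic_pressure(f, temperature_k, molar_volume_m3):
    """Osmotic pressure in Pa from f R T / V_mol."""
    return f * R * temperature_k / molar_volume_m3


# Assumed molar volume of liquid water: 18 cm^3 = 18e-6 m^3.
pressure_pa = osmotic_pressure(0.02, 298.0, 18e-6)
print(f"{pressure_pa / 1e6:.2f} MPa = {pressure_pa / 101_325:.0f} atm")
```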


9.3  Chemical  Equilibrium 

Because Gibbs energy is $G = E - TS + PV = \mu N$, or really $\sum_i \mu_i N_i$, chemists tend to
use $G$ to discuss chemical equilibrium. For isothermal isobaric processes, we have $dG = \mu\,dN$.
Now focus on the increase in $G$ for two systems interacting diffusively:

$$dG_1 = \mu_1\,dN_1 , \qquad dG_2 = \mu_2\,dN_2 = -\mu_2\,dN_1 .$$  (9.25)

Note that $G$ is extensive.

Why? At equilibrium, $G_1 = \mu N_1$ and $G_2 = \mu N_2$, so that $G_1 + G_2 = \mu N = G$. So $G$ scales linearly
with the system, which is what extensive variables do.

Because it’s extensive, we can write

$$dG = dG_1 + dG_2 = (\mu_1 - \mu_2)\,dN_1 ,$$  (9.26)

and regardless of whether $\mu_1$ is greater or less than $\mu_2$, this expression will be negative.
That  is,  G  tends  to  decrease,  until  at  equilibrium  it  must  be  a  minimum.  We’ll  use  this 
in  the  next  discussion. 

Direction  of  a  Reaction  and  Law  of  Mass  Action 

Suppose we have molecules $A$, $B$, $C$ that can react in either direction:

$$aA + bB \rightleftharpoons cC .$$  (9.27)

At some point in time, it’s known that the species have chemical potentials $\mu_A$, $\mu_B$, $\mu_C$.
We can use these potentials to determine the direction in which the reaction will proceed,
by using the fact that $G$ always decreases on the way to equilibrium. So calculate $\Delta G$ for
each direction of the reaction; the direction for which it’s negative will be the direction
in which the reaction proceeds. Previously we wrote $dG = \mu\,dN$, but when there is more
than one particle species present we should write $dG = \sum_i \mu_i\,dN_i$, or

$$\Delta G \simeq \sum_i \mu_i\,\Delta N_i .$$  (9.28)

Left to right: When the mixture loses $a$ molecules of $A$ (i.e. $-\Delta N_A = a$) and $b$ molecules
of $B$ ($-\Delta N_B = b$), it gains $c$ molecules of $C$ ($\Delta N_C = c$):

$$\Delta N_A = -a , \quad \Delta N_B = -b , \quad \Delta N_C = c .$$  (9.29)

When this happens,

$$\Delta G_{L\to R} \simeq -a\mu_A - b\mu_B + c\mu_C .$$  (9.30)

Right to left: Now everything is reversed: the mixture gains $a$ molecules of $A$ ($\Delta N_A = a$)
and so on. Hence all the signs in the calculation of $\Delta G$ are reversed from above:

$$\Delta G_{R\to L} = -\Delta G_{L\to R} .$$  (9.31)

Certainly one of either $\Delta G_{L\to R}$ or $\Delta G_{R\to L}$ is negative (unless they are both zero, in which
case the reaction has attained equilibrium). This negative one tells us in which direction
the reaction goes. You can see why $\mu$ is called the chemical potential.


If you remember that $\Delta$ always refers to an increase, you’ll always get the
signs right in analyses like the above.


The increases in particle numbers for the left-to-right version of the above reaction,
written in (9.29), are called its stoichiometric coefficients. Consider a more general reaction
with stoichiometric coefficients $b_1, b_2, b_3, \dots$. As in (9.30), we have

$$\Delta G_{L\to R} \simeq \sum_i b_i\,\mu_i\,. \tag{9.32}$$

But we saw in (9.6) that each chemical potential can be written as

$$\mu_i = kT\ln\varrho_i + \text{function}_i(T) = kT\,[\ln\varrho_i - \ln C_i(T)]\,, \tag{9.33}$$

where $\varrho_i$ are the particle densities and $C_i$ are defined for convenience in the following
calculation. Then

$$\Delta G_{L\to R} \simeq \sum_i b_i\,kT\,[\ln\varrho_i - \ln C_i(T)]\,, \tag{9.34}$$

in which case

$$\exp\frac{\Delta G_{L\to R}}{kT} = \exp\sum_i b_i\,[\ln\varrho_i - \ln C_i(T)] = \frac{\prod_i \varrho_i^{\,b_i}}{\prod_i C_i(T)^{b_i}}\,. \tag{9.35}$$


The denominator in this expression is called the equilibrium constant for the reaction,
which we’ll write as $A(T)$:

$$A(T) \equiv C_1^{b_1}\,C_2^{b_2}\,C_3^{b_3}\cdots \tag{9.36}$$

Examining (9.35) leads to the following:

$$\begin{aligned}
\textstyle\prod_i \varrho_i^{\,b_i} < A(T) &\iff \Delta G_{L\to R} < 0 \iff \text{reaction goes from left to right,}\\
\textstyle\prod_i \varrho_i^{\,b_i} > A(T) &\iff \Delta G_{L\to R} > 0 \iff \text{reaction goes from right to left,}\\
\textstyle\prod_i \varrho_i^{\,b_i} = A(T) &\iff \Delta G_{L\to R} = 0 \iff \text{reaction is at equilibrium.}
\end{aligned} \tag{9.37}$$

In particular, the last line above is known as the law of mass action: it tells us the densities
of the various species present at equilibrium. In practice the $\varrho_i$ are usually expressed as
molar densities (e.g. moles per litre), which is okay because the units will be wrapped up
inside $A(T)$.

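Example: Writing “mol/ℓ” as M, the equilibrium concentrations in the reaction

$$2A + B \rightleftharpoons 5C + 3D$$

are $\varrho_A = 1$ M, $\varrho_B = 2$ M, $\varrho_C = 3$ M. For the temperature at which the reaction occurs,
the equilibrium constant is $100\ \text{M}^5$. What is the concentration of $D$?

The law of mass action says that at equilibrium,

$$\varrho_A^{-2}\,\varrho_B^{-1}\,\varrho_C^{5}\,\varrho_D^{3} = 100\ \text{M}^5\,. \tag{9.38}$$

Thus

$$\varrho_D^3 = \varrho_A^{2}\,\varrho_B\,\varrho_C^{-5}\;100\ \text{M}^5 = (1\ \text{M})^2\,(2\ \text{M})\,(3\ \text{M})^{-5}\;100\ \text{M}^5 = 0.82\ \text{M}^3\,, \tag{9.39}$$

from which it follows that

$$\varrho_D = 0.94\ \text{M}\,. \quad \text{Answer} \tag{9.40}$$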

Note  that  if  we  had  gotten  the  relative  signs  wrong  in  our  set  of  stoichiometric  coefficients 
for  the  above  example,  the  units  wouldn’t  have  worked  out  right. 
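
The law of mass action (9.37) and the worked example above can be checked numerically. The following is a minimal Python sketch, not part of the original lectures; the function names `reaction_quotient`, `reaction_direction` and `solve_unknown_density` are hypothetical, and densities are assumed to be given in mol/ℓ with stoichiometric coefficients signed as in (9.29).

```python
import math

def reaction_quotient(species):
    """Return prod(rho_i ** b_i) for a list of (density, coefficient) pairs.

    Coefficients follow (9.29): negative for species consumed when the
    reaction runs left to right, positive for species produced.
    """
    return math.prod(rho ** b for rho, b in species)

def reaction_direction(species, equilibrium_constant):
    """Apply (9.37): compare the quotient with the equilibrium constant A(T)."""
    q = reaction_quotient(species)
    if math.isclose(q, equilibrium_constant, rel_tol=1e-9):
        return "equilibrium"
    return "left to right" if q < equilibrium_constant else "right to left"

def solve_unknown_density(known_species, unknown_coefficient, equilibrium_constant):
    """Solve prod(rho_i ** b_i) = A(T) for the single unknown density."""
    partial = reaction_quotient(known_species)
    return (equilibrium_constant / partial) ** (1.0 / unknown_coefficient)

# Worked example from the text: 2A + B <-> 5C + 3D, with A(T) = 100 M^5.
known = [(1.0, -2), (2.0, -1), (3.0, 5)]   # (density in M, coefficient) for A, B, C
rho_D = solve_unknown_density(known, 3, 100.0)
print(f"rho_D = {rho_D:.2f} M")
```

Feeding the solved density back into `reaction_direction` should report equilibrium, while a smaller density of D should drive the reaction from left to right.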


10  Fluctuations  for  a  System  in  Contact 
with  a  Reservoir 

Statistical  methods  seemingly  can’t  be  used  to  study  small  systems  such  as  a  single  atom. 
But  suppose  a  small  system  is  in  contact  with  another,  larger,  system — one  so  large  that 
its  parameters  don’t  change  significantly  when  it  interacts  with  anything  else.  This  larger 
system  is  called  a  reservoir ,  or  a  bath.  Since  the  bath  can  be  treated  statistically,  we  can 
use  the  fact  that  the  bath  and  system  affect  each  other,  to  calculate  what  happens  to  the 
system. 

So  we  ask  the  question:  what  fluctuations  in  a  (smaller)  system’s  parameters  will  occur 
when  it’s  in  equilibrium  with  a  reservoir?  We  begin  with  a  preliminary  treatment  before 
delving  into  the  Boltzmann  Distribution  in  Section  10.2.  The  probability  that  the  system 
will be found in some configuration is proportional to the number of accessible states $\Omega$
for the system + reservoir in that configuration. Focus on the fluctuations of some parameter
$X \in \{E, T, P, V, \mu, N\}$. At equilibrium $X = X_0$, but now a fluctuation $\Delta X$ occurs, so
that the new value is $X = X_0 + \Delta X$. We require the probability that the system + reservoir
could be in a state with $X$ instead of $X_0$. The probability of a fluctuation $\Delta X$ is

$$p(\Delta X) \propto \Omega(X) = e^{S_{\text{tot}}(X)/k}\,, \tag{10.1}$$

where $S_{\text{tot}}$ is the sum of the entropies of the system and reservoir. Ignoring all other
parameters for conciseness, Taylor’s theorem gives

$$S_{\text{tot}}(X) = S_{\text{tot}}(X_0 + \Delta X) = S_{\text{tot}}(X_0) + \underbrace{S'_{\text{tot}}(X_0)}_{=\,0\text{ at equilib.}}\Delta X + S''_{\text{tot}}(X_0)\,\Delta X^2/2 + \cdots \tag{10.2}$$

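So for small fluctuations $\Delta X$,

$$p(\Delta X) \propto \exp\!\left[S''_{\text{tot}}(X_0)\,\frac{\Delta X^2}{2k}\right] \equiv \exp\frac{-\Delta X^2}{2\sigma_X^2}\,, \tag{10.3}$$

where $\sigma_X$ is a gaussian characteristic spread of the fluctuations. Hence

$$\sigma_X^2 = \frac{-k}{S''_{\text{tot}}(X_0)}\,. \tag{10.4}$$

Because other variables really are present, we should write this as

$$\sigma_X^2 = \frac{-k}{\left.\partial^2 S_{\text{tot}}/\partial X^2\right|_{X_0}}\,. \tag{10.5}$$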


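Example. What is $\sigma_E/E$ for one mole of an ideal monatomic gas?

We require $\partial^2 S_{\text{tot}}/\partial E^2$ at equilibrium. Write the entropy of the system as $S$ and that
of the reservoir as $S_r$, then use conservation of total energy, volume, and particle number
to write

$$\begin{aligned}
dS &= \frac{dE}{T} + \frac{P\,dV}{T} - \frac{\mu\,dN}{T}\,,\\
dS_r &= \frac{dE_r}{T_r} + \frac{P_r\,dV_r}{T_r} - \frac{\mu_r\,dN_r}{T_r} = \frac{-dE}{T_r} - \frac{P_r\,dV}{T_r} + \frac{\mu_r\,dN}{T_r}\,.
\end{aligned} \tag{10.6}$$

Add these to get

$$dS_{\text{tot}} = \left(\frac{1}{T} - \frac{1}{T_r}\right)dE + \left(\frac{P}{T} - \frac{P_r}{T_r}\right)dV - \left(\frac{\mu}{T} - \frac{\mu_r}{T_r}\right)dN\,. \tag{10.7}$$

Then

$$\left(\frac{\partial S_{\text{tot}}}{\partial E}\right)_{V,N} = \frac{1}{T} - \frac{1}{T_r}\,, \tag{10.8}$$

and $T_r$ is constant (because the reservoir is huge, by definition), so

$$\left(\frac{\partial^2 S_{\text{tot}}}{\partial E^2}\right)_{V,N} = \frac{-1}{T^2}\left(\frac{\partial T}{\partial E}\right)_{V,N}. \tag{10.9}$$

Model the system energy’s temperature dependence as

$$E = u_0 + \nu N\,\frac{kT}{2}\,. \tag{10.10}$$

After some algebra,

$$\frac{\sigma_E}{E} = \sqrt{\frac{\nu N k^2 T^2}{2\,(u_0 + \nu N kT/2)^2}}\,. \tag{10.11}$$

For an ideal gas, $u_0 = 0$, and we get

$$\frac{\sigma_E}{E} = \sqrt{\frac{2}{\nu N}}\,. \tag{10.12}$$

The $1/\sqrt{N}$ is the signature of a relative fluctuation. For one mole of an ideal monatomic
gas this is

$$\frac{\sigma_E}{E} = \sqrt{\frac{2}{6\times 10^{23}\times 3}} \simeq 10^{-12}\,. \quad \text{Answer} \tag{10.13}$$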
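
The “some algebra” leading from (10.5), (10.9) and (10.10) to (10.11) gives $\sigma_E^2 = \nu N k^2 T^2/2$. The short Python sketch below, which is an illustration rather than part of the original notes, evaluates that spread and the ratio $\sigma_E/E$ for one mole of a monatomic gas; the function names and the choice of 300 K are assumptions made only for the example.

```python
import math

BOLTZMANN = 1.381e-23   # J/K
AVOGADRO = 6.022e23     # particles per mole

def energy_spread(dof, n_particles, temperature):
    """sigma_E = sqrt(nu N k^2 T^2 / 2), from (10.5) with (10.9) and (10.10)."""
    return math.sqrt(dof * n_particles / 2.0) * BOLTZMANN * temperature

def relative_energy_fluctuation(dof, n_particles, temperature, u0=0.0):
    """sigma_E / E as in (10.11); u0 = 0 reproduces the ideal-gas case (10.12)."""
    mean_energy = u0 + dof * n_particles * BOLTZMANN * temperature / 2.0
    return energy_spread(dof, n_particles, temperature) / mean_energy

# One mole of an ideal monatomic gas (nu = 3); the temperature cancels when u0 = 0.
ratio = relative_energy_fluctuation(dof=3, n_particles=AVOGADRO, temperature=300.0)
print(f"sigma_E / E = {ratio:.2e}")
```

The result is of order $10^{-12}$, in line with (10.13), and is independent of the temperature chosen as long as $u_0 = 0$.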


10.1  The  Concept  of  the  Ensemble 

When  calculating  quantities  relating  to  a  system  in  contact  with  a  bath,  it  can  be  helpful 
to  picture  a  large  number  of  identically  prepared  systems,  each  interacting  with  its  own 
bath,  and  each  in  some  different  (random)  stage  of  its  evolution.  This  imagined  set  of 
system-bath  pairs  is  called  an  ensemble.  We  can  treat  each  system  of  the  ensemble  as  a 
fixed  point  in  phase  space,  with  the  whole  assembly  of  points  comprising  the  path  that 
a  single  system  would  trace  out  in  phase  space  as  it  evolved.  This  idea  is  called  the 
ergodic  principle:  it  suggests  that  we  can  convert  averages  over  time  to  averages  over  the 
ensemble.  While  historically  the  ergodic  principle  has  never  been  completely  validated,  it 
is  used  frequently  in  statistical  mechanics. 

Ensembles  are  classified  by  the  extent  of  their  interaction  with  a  bath: 

No interaction: “microcanonical ensemble” (system isolated; its energy conserved)

Thermal/mechanical  interaction:  “canonical  ensemble”  (air  molecules) 

Thermal/mechanical/diffusive  interaction:  “grand  canonical  ensemble”  (ice  crystals 
interacting  with  moist  air). 


10.2  The  Boltzmann  Distribution 

Suppose the interaction increases the system’s energy, volume, and particle number by
$\Delta E$, $\Delta V$, $\Delta N$. The bath’s parameters become $E - \Delta E$, $V - \Delta V$, $N - \Delta N$. The probability $p$
that the system will have the stated energy, volume, and particle number is proportional
to the number of states accessible to the system + bath:

$$p \propto \Omega_s\,\Omega_R\,. \tag{10.14}$$

The system is simple enough that the number of states $\Omega_s$ accessible to it is easily enumerated.
On the other hand, the bath is so huge that we can’t easily enumerate the number of its
accessible states $\Omega_R$ from first principles; however, we can treat it statistically, so can
calculate $\Omega_R$ from a knowledge of its entropy $S_R = k\ln\Omega_R$. The above probability is

$$p \propto \Omega_s\,e^{S_R/k}\,. \tag{10.15}$$


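The First Law for the bath, $E_R = TS_R - PV_R + \mu N_R$, rearranges to give the bath’s entropy as

$$S_R = \frac{E_R + PV_R - \mu N_R}{T} = \frac{E - \Delta E + P(V - \Delta V) - \mu(N - \Delta N)}{T}\,, \tag{10.16}$$

so that (10.15) becomes

$$p \propto \Omega_s \exp\frac{E - \Delta E + P(V - \Delta V) - \mu(N - \Delta N)}{kT}\,. \tag{10.17}$$

Absorbing $E, V, N$ into the constant of proportionality gives

$$p \propto \Omega_s \exp\frac{-\Delta E - P\,\Delta V + \mu\,\Delta N}{kT}\,. \tag{10.18}$$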


We are free to measure $\Delta E$, $\Delta V$, $\Delta N$ relative to arbitrary reference levels $E_0$, $V_0$, $N_0$. That
is, if the system originally had $E_0$, $V_0$, $N_0$ and now it has $E_s = E_0 + \Delta E$, $V_s = V_0 + \Delta V$,
and $N_s = N_0 + \Delta N$, then (10.18) becomes

$$p \propto \Omega_s \exp\frac{-(E_s - E_0) - P(V_s - V_0) + \mu(N_s - N_0)}{kT} \propto \Omega_s \exp\frac{-E_s - PV_s + \mu N_s}{kT}\,, \tag{10.19}$$


where  the  system’s  energy  Es  might  not  all  have  come  from  an  interaction  with  the  bath. 

For now we’ll restrict ourselves to the case in which no particles are exchanged with the
reservoir. The $PV_s$ term is significant only in exceptional circumstances of high pressure,
such as in a neutron star. So we usually write (10.19) as

$$p \propto \Omega_s\,e^{-E_s/(kT)}\,. \tag{10.20}$$

This is the famous Boltzmann Distribution, and is one of the central equations of statistical
mechanics.


Example: Suppose a system of hydrogen gas is in contact with a bath at $T = 295$ K.
What ratio of the H atoms will be in the 1st excited state, compared to the ground state?
Use $E_0 = -13.6$ eV, $E_1 = -3.4$ eV.

$$\frac{p(E_1)}{p(E_0)} = \frac{\Omega_1\,e^{-E_1/(kT)}}{\Omega_0\,e^{-E_0/(kT)}} = \frac{\Omega_1}{\Omega_0}\,e^{-(E_1 - E_0)/(kT)}\,. \tag{10.21}$$

In the ground state, the H quantum numbers are $(n\,\ell\,m) = (1\,0\,0)$, with two possible
electron spins, so $\Omega_0 = 2$. In the 1st excited state the quantum numbers can be
$(2\,0\,0)$, $(2\,1\,1)$, $(2\,1\,0)$, $(2\,1\,{-1})$, each with two spins, so $\Omega_1 = 8$. We have

$$kT \simeq \frac{1.381\times 10^{-23} \times 295}{1.602\times 10^{-19}}\ \text{eV} \simeq 0.0254\ \text{eV}\,. \tag{10.22}$$

Then

$$\frac{p(E_1)}{p(E_0)} = \frac{8}{2}\,\exp\frac{-10.2\ \text{eV}}{0.0254\ \text{eV}} \simeq 2\times 10^{-174}\,. \quad \text{Answer} \tag{10.23}$$

Clearly,  most  of  the  atoms  are  unexcited. 
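
The hydrogen estimate can be reproduced with a few lines of Python. This is a minimal sketch under the same inputs as the example (degeneracies 8 and 2, a gap of 10.2 eV, $T = 295$ K); the function name `boltzmann_ratio` is hypothetical. Because the exponent is about $-400$, the leading digit of the answer is sensitive to how $kT$ is rounded, so only the order of magnitude should be compared with (10.23).

```python
import math

BOLTZMANN_J_PER_K = 1.381e-23
JOULES_PER_EV = 1.602e-19

def boltzmann_ratio(deg_upper, deg_lower, energy_gap_eV, temperature_K):
    """Population ratio (Omega_1/Omega_0) exp(-(E_1 - E_0)/(kT)), as in (10.21)."""
    kT_eV = BOLTZMANN_J_PER_K * temperature_K / JOULES_PER_EV
    return (deg_upper / deg_lower) * math.exp(-energy_gap_eV / kT_eV)

kT_eV = BOLTZMANN_J_PER_K * 295.0 / JOULES_PER_EV
ratio = boltzmann_ratio(deg_upper=8, deg_lower=2, energy_gap_eV=10.2, temperature_K=295.0)
print(f"kT = {kT_eV:.4f} eV, excited/ground ratio = {ratio:.1e}")
```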

It’s  useful  to  note  that  at  room  temperature,  kT  ~  1/40  eV.  Remember  this  number:  it  forms  a 
good  rule  of  thumb  to  help  you  determine  quickly  whether  much  of  a  system  is  excited  or  not. 
In  the  above  example,  the  energy  “distance”  to  the  first  excited  state  is  10.2  eV,  and  this  is  so 
much  larger  than  1/40  eV  that  we  can  see  immediately  that  there  will  be  almost  no  excitation. 
The  typical  energy  supplied  by  interactions  with  the  bath,  kT,  is  just  too  small  to  excite  many 
atoms — although,  fluctuations  being  what  they  are,  some  few  atoms  will  be  excited. 


In general, we can define the excitation temperature of a system as $T_e$ such that

$$\frac{E_1 - E_0}{kT_e} \equiv 1\,. \tag{10.24}$$

The excitation temperature gives an indication of the temperature at which an appreciable
number of particles begin to occupy the 1st excited state.

10.3  Diatomic  Gases  and  Heat  Capacity 

In (6.11) we wrote $\gamma = (\nu + 2)/\nu$ in the context of heat capacity. At room temperature,
$\gamma$ is measured to have the following values:

molecule:    HCl     NO      Cl2     Br2     I2
$\gamma$:    1.41    1.40    1.36    1.32    1.30

These values begin at 7/5, so diatomic gases such as HCl and NO seem to be rigid rotors
with 5 degrees of freedom: presumably 3 translational and 2 rotational. For a non-rigid
rotor, we expect vibration to contribute 2 more degrees of freedom, making $\gamma = 9/7 \simeq 1.29$.
So iodine seems to be a non-rigid rotor at room temperature, with chlorine and bromine
somewhere in between.

It  turns  out  that  the  heat  capacity  for  any  particular  gas  varies  with  temperature 
as  the  degrees  of  freedom  go  from  purely  translational  to  translational  +  rotational,  to 
translational  +  rotational  +  vibrational.  Let’s  investigate  this  using  quantum  mechanics. 


10.3.1  Rotation 

A rigid diatomic molecule with moment of inertia $I$ can only have rotational energy levels of

$$E_\ell = \frac{\ell(\ell+1)\,\hbar^2}{2I}\,, \quad \text{where } \ell = 0, 1, 2, \dots \tag{10.25}$$

In a gas at temperature $T$, we have relative populations

$$\frac{N(E_\ell)}{N(E_0)} = \frac{\Omega_\ell\,e^{-E_\ell/(kT)}}{\Omega_0\,e^{-E_0/(kT)}}\,, \tag{10.26}$$

where $\Omega_\ell = 2\ell + 1$ is the usual quantum mechanical degeneracy associated with the set
$m = -\ell, \dots, \ell$. Thus

$$\frac{N(E_\ell)}{N(E_0)} = (2\ell + 1)\,\exp\frac{-\ell(\ell+1)\,\hbar^2}{2IkT}\,. \tag{10.27}$$

Define the characteristic temperature of (the onset of) rotation as $T_R$, where

$$kT_R \equiv \frac{\hbar^2}{2I}\,. \tag{10.28}$$

Then

$$\frac{N(E_\ell)}{N(E_0)} = (2\ell + 1)\,e^{-\ell(\ell+1)\,T_R/T}\,. \tag{10.29}$$

If $T \ll T_R$ then $N(E_\ell)/N(E_0) \simeq 0$ for $\ell \geq 1$, and the rotational states are “frozen out”. The
rotational energy level spacing is large compared to the ambient supply of thermal energy $kT$,
which is why rotational states can’t be activated.

If $T \gg T_R$ then $N(E_\ell)/N(E_0)$ is no longer close to zero and the rotational energy states are well populated.
The rotational energy level spacing is now small compared to $kT$. With so many rotational
states  able  to  be  accessed,  a  gas  of  such  molecules  well  and  truly  has  rotational  degrees  of 
freedom,  and  can  be  treated  using  the  Equipartition  Theorem. 
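
The two regimes of (10.29) are easy to see numerically. The sketch below is illustrative only; the function name `rotational_population_ratio` is hypothetical, and temperatures are expressed in units of $T_R$ so that no molecular data need be assumed.

```python
import math

def rotational_population_ratio(ell, temperature, rotation_temperature):
    """N(E_l)/N(E_0) = (2l + 1) exp(-l(l+1) T_R / T), as in (10.29)."""
    return (2 * ell + 1) * math.exp(-ell * (ell + 1) * rotation_temperature / temperature)

T_R = 1.0  # work in units of the rotation temperature
cold = [rotational_population_ratio(ell, 0.1 * T_R, T_R) for ell in range(4)]
hot = [rotational_population_ratio(ell, 100.0 * T_R, T_R) for ell in range(4)]
print("T = 0.1 T_R:", ["%.1e" % r for r in cold])   # excited levels frozen out
print("T = 100 T_R:", ["%.2f" % r for r in hot])    # excited levels well populated
```

At $T = 0.1\,T_R$ every excited level is suppressed by many orders of magnitude, whereas at $T = 100\,T_R$ the low levels are populated roughly in proportion to their degeneracy $2\ell + 1$.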


Rotation about a non-internuclear axis  Consider a classical picture of a rigid rotor
formed from two masses lying along the $x$ axis, that spin around the $z$ axis about their
centre of mass, which lies at the origin. The masses forming the rotor are $m_1$, distance $r_1$
from the origin, and $m_2$, distance $r_2$ from the origin. The two masses are a distance $D$
apart. What is $T_R$? We need the moment of inertia $I_z$ for rotation about the $z$ axis:

$$I_z = m_1 r_1^2 + m_2 r_2^2\,. \tag{10.30}$$

But if the centre of mass is at the origin, then $-m_1 r_1 + m_2 r_2 = 0$. We can use this to
show that

$$I_z = \mu D^2\,, \quad \text{where } \frac{1}{\mu} \equiv \frac{1}{m_1} + \frac{1}{m_2}\,. \tag{10.31}$$

$\mu$ is called the reduced mass of the system.

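Example: Calculate the characteristic temperature of rotation for CO, given that the
C and O atoms are a distance of 0.112 nm apart.

The masses of an atom of carbon and an atom of oxygen are $12\ \text{g}/N_A$ and $16\ \text{g}/N_A$
respectively, and $D = 0.112$ nm. Thus

$$T_R = \frac{\hbar^2}{2I_z k} = \frac{\hbar^2}{2\mu D^2 k} = \frac{[6.626\times 10^{-34}/(2\pi)]^2 \times 6.022\times 10^{23}\,\left(\frac{1}{0.012} + \frac{1}{0.016}\right)}{2 \times (0.112\times 10^{-9})^2 \times 1.38\times 10^{-23}}\ \text{K} = 2.8\ \text{K}\,. \quad \text{Answer} \tag{10.32}$$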

So at room temperature a gas of CO molecules has many rotational states occupied, and
the Equipartition Theorem can be applied to it.

Rotation about the internuclear axis  For the same masses above, we can use a classical
picture to calculate the moment of inertia $I_x$ for rotation about the $x$ axis, i.e. about
the line joining the masses. We find that $I_x \ll I_z$. For CO, a similar calculation to the
one above gives the characteristic temperature for the onset of this type of rotation as
$\simeq 100{,}000$ K; hence such rotation is frozen out at room temperature. In fact, quantum
mechanically, there can be no rotation at all about the $x$ axis, so this rotation is frozen
out at all temperatures.
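
The value of $T_R$ quoted for CO in the example above, for rotation about a non-internuclear axis, can be reproduced from (10.28) and (10.31). The following Python sketch is an illustration only; the function name `rotation_temperature` is hypothetical, and the constants are the rounded values used in the worked example.

```python
import math

PLANCK = 6.626e-34      # J s
BOLTZMANN = 1.381e-23   # J/K
AVOGADRO = 6.022e23     # per mole

def rotation_temperature(molar_mass_1_g, molar_mass_2_g, separation_m):
    """T_R = hbar^2 / (2 mu D^2 k) for a rigid rotor of two point masses."""
    m1 = molar_mass_1_g * 1e-3 / AVOGADRO   # mass of one atom, kg
    m2 = molar_mass_2_g * 1e-3 / AVOGADRO
    reduced_mass = m1 * m2 / (m1 + m2)      # equivalent to 1/mu = 1/m1 + 1/m2
    moment_of_inertia = reduced_mass * separation_m ** 2
    hbar = PLANCK / (2.0 * math.pi)
    return hbar ** 2 / (2.0 * moment_of_inertia * BOLTZMANN)

T_R_CO = rotation_temperature(12.0, 16.0, 0.112e-9)
print(f"T_R(CO) = {T_R_CO:.1f} K")
```

Since $T_R \propto 1/I$, the same function shows why a much smaller moment of inertia about the internuclear axis pushes the corresponding characteristic temperature far above room temperature.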


10.3.2  Vibration 

A harmonic oscillator of frequency $f$ can only have vibrational energies

$$E_n = (n + \tfrac{1}{2})\,hf\,, \quad \text{where } n = 0, 1, 2, \dots \tag{10.33}$$

Relative populations are

$$\frac{N(E_n)}{N(E_0)} = \frac{\Omega_n\,e^{-(n + 1/2)\,hf/(kT)}}{\Omega_0\,e^{-\frac{1}{2}hf/(kT)}} = e^{-n\,hf/(kT)}\,, \tag{10.34}$$

because $\Omega_n = 1$ for all $n$. Define the characteristic temperature of (the onset of) vibration
as $T_V$, where

$$kT_V \equiv hf\,. \tag{10.35}$$

Then

$$\frac{N(E_n)}{N(E_0)} = e^{-n\,T_V/T}\,. \tag{10.36}$$

If $T \ll T_V$ then $N(E_n)/N(E_0) \simeq 0$ for $n \geq 1$, and the vibrational levels are frozen out. The
vibrational energy level spacing is large compared to $kT$.

If $T \gg T_V$ then $N(E_n)/N(E_0)$ is no longer close to zero and the vibrational energy levels are well populated.
The  vibrational  energy  level  spacing  is  now  small  compared  to  kT.  With  the  vibrational 
levels  well  occupied,  a  gas  of  such  molecules  has  vibrational  degrees  of  freedom,  and  can 
be  treated  using  the  Equipartition  Theorem.  Typical  values  of  Tv  for  light  molecules 
are  several  thousand  kelvins,  so  at  room  temperature  and  beyond,  these  molecules  don’t 
vibrate. 
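
To put a number on “don’t vibrate”, (10.36) can be evaluated directly. The sketch below is illustrative: the function name is hypothetical, and $T_V = 3000$ K is an assumed round figure standing in for “several thousand kelvins”, not a measured value for any particular molecule.

```python
import math

def vibrational_population_ratio(n, temperature, vibration_temperature):
    """N(E_n)/N(E_0) = exp(-n T_V / T), as in (10.36)."""
    return math.exp(-n * vibration_temperature / temperature)

ASSUMED_T_V = 3000.0  # kelvin; illustrative value only
room = vibrational_population_ratio(1, 295.0, ASSUMED_T_V)
hot = vibrational_population_ratio(1, 30000.0, ASSUMED_T_V)
print(f"first excited level at 295 K:   {room:.1e}")
print(f"first excited level at 30000 K: {hot:.2f}")
```

With this assumed $T_V$, only a few molecules in every hundred thousand occupy the first excited vibrational level at room temperature.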


10.4  Equipartition  for  a  System  Touching  a  Bath 


The  Equipartition  Theorem  was  stated  originally  in  Section  5.1  for  what  amounted  to  an 
isolated system. We saw there that each degree of freedom contributes $\tfrac{1}{2}kT$ to the total
energy.  What  is  its  equivalent  for  a  system  in  contact  with  a  bath?  Fluctuations  now 
allow  the  system  to  have  a  range  of  energies.  For  such  a  case,  let’s  calculate  the  mean 
energy  contributed  to  the  system  by  each  degree  of  freedom. 

The mean thermal energy associated with any particular coordinate $u$ is

$$\langle E_u \rangle = \int_0^\infty E_u\,p(E_u)\,dE_u\,, \tag{10.37}$$

where

$$\begin{aligned}
p(E_u)\,dE_u &= \text{prob. that the system has energy in } E_u \to E_u + dE_u\\
&= (\text{prob. system in a state with } E_u) \times \text{no. of states in that interval.}
\end{aligned} \tag{10.38}$$

The probability of being in a state with energy $E_u$ is $\propto e^{-E_u/(kT)}$. The number of states
in the energy interval $dE_u$ equals the number of states in the coordinate interval $du$. In
Section 3.1, we showed that this number is proportional to the length $du$ of that interval.
Hence

$$p(E_u)\,dE_u = A\,e^{-E_u/(kT)}\,du\,, \quad \text{for some normalisation } A. \tag{10.39}$$

As before, we’ll just consider quadratic energy dependence: $E_u = bu^2$ for some $b$. Calculate
$A$ by demanding that $\int_0^\infty p(E_u)\,dE_u = 1$. (Note that it suffices to consider positive $u$
only, since $E_u$ is even in $u$. In fact, if we use the whole range of $u$ we will get a different
normalisation, but the end result for $\langle E_u \rangle$ will be the same.) Using the gaussian
integral (2.27) it’s easy to show that

$$A = 2\sqrt{\frac{b}{\pi kT}}\,. \tag{10.40}$$

Now we can evaluate (10.37):

$$\langle E_u \rangle = \int_0^\infty bu^2\,A\,e^{-bu^2/(kT)}\,du = bA \int_0^\infty u \cdot u\,e^{-bu^2/(kT)}\,du\,. \tag{10.41}$$

This last equation can be integrated by parts (the parts are $u$ and $u\,e^{-bu^2/(kT)}$) to give

$$\langle E_u \rangle = \tfrac{1}{2}kT\,. \tag{10.42}$$


[For an alternative way to integrate (10.41), see (12.5) and (12.6).] So this is the generalisation
of the Equipartition Theorem to the case of a non-isolated system. Each degree of
freedom now contributes an average value of $\tfrac{1}{2}kT$ to the thermal energy.
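
As a cross-check on the integration by parts, (10.41) can be evaluated numerically with the normalisation (10.40). The Python sketch below uses a simple midpoint rule rather than any library integrator; the function name, the cutoff of 12 characteristic widths and the sample values of $b$ and $kT$ are assumptions made for the illustration.

```python
import math

def mean_quadratic_energy(b, kT, n_steps=20_000):
    """Midpoint-rule estimate of <E_u> = int_0^inf b u^2 A exp(-b u^2/kT) du,
    with A = 2 sqrt(b / (pi kT)) as in (10.40)."""
    A = 2.0 * math.sqrt(b / (math.pi * kT))
    u_max = 12.0 * math.sqrt(kT / b)   # integrand is negligible beyond this cutoff
    du = u_max / n_steps
    total = 0.0
    for j in range(n_steps):
        u = (j + 0.5) * du
        total += b * u * u * A * math.exp(-b * u * u / kT) * du
    return total

# The answer should be kT/2 whatever the stiffness b (arbitrary units here).
for b in (0.3, 3.7, 50.0):
    print(b, mean_quadratic_energy(b, kT=2.0))
```

Each value printed should be close to 1.0, that is $kT/2$ for $kT = 2$, independently of $b$.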


11  Entropy  of  a  System  Touching  a  Bath 

The  central  postulate  of  statistical  mechanics  is  that  the  accessible  states  of  an  isolated 
system are all equally likely. Calling the number of these $\Omega$, we defined the thermodynamic
entropy $S = k\ln\Omega$ in Section 5. From a purely mathematical viewpoint, we also had the
statistical entropy $\sigma = \ln\Omega$, which just omits Boltzmann’s constant.

A system S in contact with a bath is no longer isolated, so its accessible states are
in general not equally likely. Can we still find an expression for its entropy? Suppose its
states are labelled $|1\rangle, |2\rangle, \dots, |M\rangle$, where $M$ might be infinite. If the probabilities $p_i$ of
being found in a state $|i\rangle$ are all equal, we can set $\Omega = M$ and write $\sigma = \ln M$. When
the probabilities are not all equal, we form an (isolated) ensemble of a huge number $N$ of
distinguishable copies of S. As $N \to \infty$, the number of copies of S found in state $|i\rangle$ is
$n_i = Np_i$.

Now, how to count the accessible states? This number will be hugely dominated by
the number pertaining to equilibrium, being the number of ways that we can arrange
$n_1$ systems in $|1\rangle$, $n_2$ systems in $|2\rangle$, etc. Because the number of accessible states is so
incredibly highly peaked, we can set it equal to this equilibrium number and not include all
the other ways, such as $n_1 + n_2$ copies in $|1\rangle$, none in $|2\rangle$, $n_3$ in $|3\rangle$, etc. We calculated
this number for $M = 2$ states in Section 2 by considering the binomial distribution. There we found:

$$\text{Number of ways of putting } n_1 \text{ systems in } |1\rangle \text{ and } n_2 \text{ systems in } |2\rangle = \frac{N!}{n_1!\,n_2!}\,, \tag{11.1}$$

where $N = n_1 + n_2$. The same reasoning applies to the case of more than 2 states. Label
$N$ distinguishable particles $1, 2, \dots, N$ and allocate each to one of $M$ bins; then apply the
approach outlined in (2.1). Each combination will occur $n_1!\,n_2! \dots n_M!$ times, so the total
number of permutations, $N!$, over-counts by this factor. Thus the required total number
of combinations must be

$$\frac{N!}{n_1!\,n_2! \dots n_M!}\,. \tag{11.2}$$


The actual probability distribution that extends the binomial case is called the multinomial
distribution. That is, if the chance of a particular particle being allocated to bin $i$ is $p_i$, then what is the
chance $P(n_1, n_2, \dots, n_M)$ of finding $n_i$ particles, without regard for order, in bin $i$ for some given
$n_1, n_2, \dots, n_M$? Each combination described in the previous paragraph occurs with probability
$p_1^{n_1}\,p_2^{n_2} \dots p_M^{n_M}$, so

$$P(n_1, n_2, \dots, n_M) = \frac{N!}{n_1!\,n_2! \dots n_M!}\;p_1^{n_1}\,p_2^{n_2} \dots p_M^{n_M}\,. \tag{11.3}$$

This is useful to know, but we only need the total number of combinations $N!/(n_1!\,n_2! \dots n_M!)$.


The number of combinations in (11.2) pertains to the whole ensemble so, to the accuracy
mentioned above, it must equal $\Omega^N$, since the number of states is multiplicative just as
entropy is additive. In that case the statistical entropy is

$$\begin{aligned}
\sigma = \ln\Omega = \frac{1}{N}\ln\Omega^N &= \frac{1}{N}\Big[\ln N! - \sum_{i=1}^{M}\ln n_i!\Big]\\
&\simeq \frac{1}{N}\Big[(N + \tfrac{1}{2})\ln N - N + \ln\sqrt{2\pi} - \sum_{i=1}^{M}\Big((n_i + \tfrac{1}{2})\ln n_i - n_i + \ln\sqrt{2\pi}\Big)\Big]\,.
\end{aligned} \tag{11.4}$$

(Note that this expression is not overly changed if $M \to \infty$, since the states can’t all have
large occupation numbers. Stirling’s approximation only applies to those with large $n_i$, and
we’ll ignore the rest because they are not well populated.) Writing $\ln(Np_i) = \ln N + \ln p_i$,
regrouping and cancelling terms gives


$$\sigma \simeq \frac{1 - M}{2N}\ln N + \frac{(1 - M)\ln\sqrt{2\pi}}{N} - \sum_i p_i\ln p_i - \frac{1}{2N}\sum_i \ln p_i \;\to\; -\sum_i p_i\ln p_i \quad \text{as } N \to \infty. \tag{11.5}$$

This last expression is important and famous: it’s the statistical entropy of a single system,
and is worth rewriting in a box:

$$\boxed{\;\sigma = -\sum_{i=1}^{M} p_i\ln p_i\,.\;} \tag{11.6}$$

(Thus the thermodynamic entropy is $S = -k\sum_i p_i\ln p_i$.) What does (11.6) give for the
entropy of an isolated system? The fundamental postulate of statistical mechanics (on
page 7) says that all states are equally possible for an isolated system in equilibrium. If
the $M$ states are all equally possible, then $p_i = 1/M$ for all $i$. Then (11.6) becomes

$$\sigma = -\sum_{i=1}^{M}\frac{1}{M}\ln\frac{1}{M} = -\ln\frac{1}{M} = \ln M\,. \tag{11.7}$$


This  is  just  what  we  expect  from  first  principles,  because  in  such  a  case  of  M  equiprobable 
states we’d write $\Omega = M$, so that $\sigma = \ln\Omega = \ln M$. So the above ensemble analysis is
consistent  with  the  fundamental  definition  of  entropy  for  an  isolated  system  that  we  have 
been  using  throughout  this  course. 
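
The passage from the ensemble count (11.2) to the single-system entropy (11.6) can be checked numerically. In the Python sketch below, which is an illustration and not part of the original notes, `math.lgamma(n + 1)` is used for $\ln n!$ so that non-integer $n_i = Np_i$ cause no trouble; the function names and the sample probabilities are assumptions.

```python
import math

def statistical_entropy(probs):
    """sigma = -sum p_i ln p_i, as in (11.6); states with p_i = 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def ensemble_entropy_per_copy(probs, n_copies):
    """(1/N) ln [ N! / (n_1! ... n_M!) ] with n_i = N p_i, the first line of (11.4)."""
    log_count = math.lgamma(n_copies + 1) - sum(
        math.lgamma(n_copies * p + 1) for p in probs
    )
    return log_count / n_copies

probs = [0.5, 0.3, 0.2]   # sample probabilities, chosen only for illustration
for n_copies in (1e2, 1e4, 1e6):
    print(n_copies, ensemble_entropy_per_copy(probs, n_copies), statistical_entropy(probs))

# Equal probabilities recover sigma = ln M, as in (11.7).
M = 8
uniform_entropy = statistical_entropy([1.0 / M] * M)
```

As the number of copies grows, the per-copy value approaches $-\sum_i p_i\ln p_i$, with a difference that shrinks roughly like $(\ln N)/N$ as (11.5) suggests.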


11.1  A  Brief  Information  Primer 


The  expression  for  entropy  that  we  have  just  found  was  known  by  physicists  long  before 
it  was  rediscovered  in  a  new  context  by  Claude  Shannon  in  the  1940s,  who  pioneered  the 
field  of  information  theory. 

Central to information theory is the idea of the probability $p_i$ that the next symbol
in a sequence being transmitted will be symbol $i$. If $p_i \simeq 0$ then symbol $i$ is rare, so we’ll
be surprised to see it; in that sense, it has a high “information-transmitting potential”.
If $p_i \simeq 1$, symbol $i$ is common, so we are not surprised to see it: it can’t have much
information-transmitting potential if it was expected anyway. Define the “surprise at
seeing symbol $i$” to be $-\log_b p_i$. The number $b$ is usually set equal to 2 by information
theorists, as it relates to storing information in a binary way using e.g. on/off settings of
a switch. We’ll leave it as a general number $b$ here. In fact, for any $a, b, c$,

$$\log_a b = \frac{\log_c b}{\log_c a}\,, \tag{11.8}$$


so  the  logarithm  to  any  base  is  just  equal  to  a  constant  times  the  natural  logarithm  that 
we  have  been  using  exclusively  in  statistical  mechanics. 

As the probability $p_i$ that the next symbol will be $i$ ranges from 0 to 1, the surprise
$-\log_b p_i$ associated with seeing it ranges from $\infty$ to 0, consistent with the above discussion.
Our average surprise on seeing the next symbol (i.e. averaged over all symbols of the
alphabet being used) is

$$\text{average surprise} = \langle -\log_b p \rangle = -\sum_i p_i\log_b p_i = \frac{-1}{\ln b}\sum_i p_i\ln p_i\,. \tag{11.9}$$


Note that we could have used another expression, such as $1/p_i - 1$, for the surprise at seeing
symbol $i$, that would also range from $\infty$ to 0. However, Shannon’s original analysis showed
that (11.9) is the only possible expression for what we have called the average surprise that is
consistent with a set of requirements that he laid down for such a quantity.

The average surprise is called the Shannon entropy of the alphabet, and usually $b$ is set
equal to 2; in other words, Shannon replaced the Boltzmann constant of our expression for
entropy by $1/\ln 2$ in his context. This average surprise can be shown to be maximal when all
the $p_i$ are equal (we’ll do so below for a 2-symbol alphabet). In that sense, a high entropy
means each letter is being well used. Such an alphabet has a high information-transmitting
potential:

$$\text{information-transmitting potential} = \text{Shannon entropy} = \text{average surprise when } b = 2. \tag{11.10}$$


Example: Describe the information-transmitting potential of an alphabet of two symbols;
i.e., when two symbols are available to be transmitted. Symbol 1 appears with
probability $p_1$ and symbol 2 appears with probability $p_2 = 1 - p_1$.

There is only one free variable: choose it to be $p_1$. Also, to show that base 2 isn’t
actually necessary to this discussion, we won’t set $b = 2$. Then (11.9) produces

$$I(p_1) \equiv \text{inf.-trans. potential} = \frac{-1}{\ln b}\,\big[\,p_1\ln p_1 + (1 - p_1)\ln(1 - p_1)\big]\,. \tag{11.11}$$

We leave it to you to show that $I(0) = I(1) = 0$. Also, $I'(p_1) = \log_b(1/p_1 - 1)$, which is zero
when $p_1 = 1/2$. This leads to a graph of $I(p_1)$ versus $p_1$ that is everywhere concave down,
rising from zero at the endpoints to a maximum at the midpoint of $p_1 = p_2 = 1/2$, and
symmetrical about that midpoint. So the information-transmitting potential (Shannon
entropy) of this small alphabet is maximal when each symbol is equally allowed to appear
(which sounds reasonable) and zero when only one symbol is allowed to appear. Again
this is quite reasonable; after all, if sentences using the alphabet were primarily composed
of just one of the symbols, then the alphabet wouldn’t be of much use.

It’s not hard to show, using the method of Lagrange multipliers, that the same conclusion
is true for an alphabet of any length. The information-transmitting potential of
an alphabet of $M$ letters is again maximal when each letter is equally allowed to appear,
and has the value $-\log_b(1/M) = \log_b M$ where $b$ is arbitrary.
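
The shape of $I(p_1)$ described in the two-symbol example can be tabulated directly from (11.11). This Python sketch is illustrative; the function name `two_symbol_potential` is hypothetical, and the convention $0\ln 0 = 0$ is assumed at the endpoints.

```python
import math

def two_symbol_potential(p1, base=2.0):
    """I(p1) = -(1/ln b) [p1 ln p1 + (1 - p1) ln(1 - p1)], as in (11.11).

    Terms with zero probability are skipped, i.e. 0 ln 0 is taken as 0.
    """
    terms = [p for p in (p1, 1.0 - p1) if p > 0.0]
    return -sum(p * math.log(p) for p in terms) / math.log(base)

for p1 in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p1 = {p1:4.2f}   I = {two_symbol_potential(p1):.4f}")
```

The table rises from zero at the endpoints to a maximum at $p_1 = 1/2$ and is symmetric about that point; changing `base` only rescales every entry by the same constant, in keeping with (11.8).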


Example:  We  can  calculate  the  information-transmitting  potential,  or  entropy,  of 
the  English  language  as  follows.  Take  a  typical  book  that  represents  the  language  as  it’s 
normally  used.  Count  the  frequencies  of  the  letters  and  punctuation  and  use  these  to 
compile  a  set  of  probabilities  for  those  characters.  For  example,  when  the  letters  “a”  to  “z” 
and spaces are counted, a set of probabilities $p_1, \dots, p_{27}$ is produced. (These are not all
equal; for instance, the probability of an “e” is comparatively high, and so on.) One such
set of typical probabilities gives $-\sum_i p_i\ln p_i \simeq 2.83$. Thus the information-transmitting
potential of English is about 2.83 (using the natural logarithm). Now suppose that English
were to be replaced by a new alphabet in which each letter was equally able to appear. How
many letters would be required to match the information-transmitting potential of English?
Call this number of letters $M$. Then the fact that all of the new $p_i$ are equal implies that the
new alphabet’s information-transmitting potential is $\ln M$. This is required to equal 2.83,
so $M = e^{2.83} \simeq 17$. That is, the new alphabet would need just 17 letters.
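
The last step of this example is a one-line calculation, sketched here in Python. The function name `equivalent_alphabet_size` is hypothetical, and the value 2.83 is simply the figure quoted in the text, not something recomputed from letter frequencies.

```python
import math

def equivalent_alphabet_size(entropy_nats):
    """Number M of equally likely letters whose entropy ln M matches the given value."""
    return math.exp(entropy_nats)

M_english = equivalent_alphabet_size(2.83)
print(f"M = {M_english:.1f}")
```

For comparison, 27 equally likely characters would have an entropy of $\ln 27 \simeq 3.30$, so the quoted figure of 2.83 reflects the uneven use of letters in English.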

That  English  could  get  by  with  17  letters  does  not  imply  that  it  should  have  17  letters. 
Redundancy  in  information  flow  can  be  useful  because  it  makes  the  job  of  processing 
that information easier for our brains, as well as giving us time to savour what is being
imparted. If English were to be pared down to become entropically efficient (but perhaps
lifeless),  our  level  of  concentration  would  have  to  increase  to  ensure  we  didn’t  miss  any 
details  of  its  tight  transmission.  That  would  tend  to  introduce  errors  of  its  own. 

Additionally,  if  our  task  were  to  trim  our  alphabet  to  a  set  that  fulfils  a  computer’s 
expectations  of  efficiency,  then  where  would  such  trimming  end?  Would  each  letter  be 
streamlined  to  minimise  its  use  of  ink  and  complexity  of  shape,  with  the  final  result  being 
that  calligraphy  is  reduced  to  nothing  more  than  strings  of  agitated  commas?  No,  and  for 
the  same  reason,  singers  don’t  replace  a  song’s  repeated  verse  with  the  word  “Ditto”.  Nor 
do  portrait  painters  depict  only  one  eye  of  their  subject,  arguing  that  the  other  is  more  or 
less  a  mirror  image  and  so  needn’t  be  drawn. 

Entropy  and  Information 

The  entropy,  or  information-transmitting  potential,  of  an  alphabet  is  usually  just  called 
its  “information”  by  information  theorists,  who  know  what  they  are  doing.  However,  we 
shouldn’t be misled by this word into thinking that the entropy $-\sum_i p_i \ln p_i$ is somehow
giving the information content of whatever was sampled to give the set of $p_i$. The concept of
information  as  it’s  usually  understood  (as  opposed  to  information-transmitting  potential) 
is  not  quantifiable,  and  certainly  cannot  easily  be  related  to  entropy.  For  example,  the 
entropy,  or  information-transmitting  potential,  that  we  calculated  above  for  English  has 
nothing  to  do  with  whatever  information  might  be  contained  in  the  sample  text  that  was 
used  to  compile  the  set  of  probabilities. 

To  see  further  why  entropy  and  information  content  cannot  simply  be  equated,  consider 
a  monkey  who  types  a  book  using  a  standard  26-letter  typewriter.  Each  letter  will  probably 
appear about 1/26 of the time, so that all the $p_i$ equal 1/26. Now suppose I write a book.
I use a new language in which each word has 26 letters, with each letter from “a” to “z”
appearing exactly once. (There are 26! possible words in such a language, more than
enough for the job.) Each letter appears 1/26 of the time in my book too, so again all
the $p_i$ equal 1/26. Yet the monkey’s book almost certainly carries no information, whereas
presumably  my  book  has  a  lot  of  information. 
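
The point can be checked numerically. The sketch below is an illustration only: the book lengths, the fixed random seed, and the helper name `letter_entropy` are all assumptions. It builds a random “monkey” text and a text made of words that each use every letter exactly once, then compares their single-letter entropies.

```python
import math
import random
from collections import Counter

def letter_entropy(text):
    """Return -sum p_i ln p_i over the characters of text."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log(n / total) for n in counts.values())

alphabet = "abcdefghijklmnopqrstuvwxyz"
rng = random.Random(0)   # fixed seed, so the run is repeatable

# The monkey: 26 000 independent, equally likely key presses.
monkey_book = "".join(rng.choice(alphabet) for _ in range(26_000))

# The new language: 1000 words, each a permutation of all 26 letters.
my_book = "".join("".join(rng.sample(alphabet, 26)) for _ in range(1000))

print(letter_entropy(monkey_book))   # close to ln 26
print(letter_entropy(my_book))       # ln 26, up to rounding
print(math.log(26))
```

Both texts give essentially the same entropy, although only one of them was built from a vocabulary of words.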

So a book’s information content is not simply related to the set of $p_i$, nor is it easily
related to various correlations of letters, although these would have to be taken into account
in any deeper analysis. The deepest analysis would completely classify all correlations to
such  an  extent  that  we  would  simply  end  up  reading  the  book;  nevertheless,  we  could  still 
only  make  a  value  judgement  on  how  much  information  it  contains.  Information  theory 
is  not  about  such  things.  Rather,  it’s  concerned  with  information-transmitting  potential. 
And  certainly  the  two  alphabets  of  the  two  languages  used  by  the  monkey  and  myself  do 
have  high  information-transmitting  potentials. 

11.2  The  Brandeis  Dice 

The  following  question  was  made  famous  by  the  statistical  physicist  E.T.  Jaynes  in  his 
1962  lectures  at  Brandeis  University. 

A  (possibly  biased)  die  is  thrown  many  times,  and  the  results  are  summarised  in  a  single 
statement:  the  mean  number  showing  on  the  top  face  is  5.  What  is  the  best  estimate  of 
the  probability  of  getting  each  of  the  numbers  1  to  6  on  the  next  throw?  The  mean  of 
the numbers that land face up on an unbiased die will be 3.5, so we do know that the
probabilities  for  each  of  the  numbers  1  to  6  cannot  all  be  equal.  That’s  all  we  can  say  with 
certainty,  but  we  can  make  an  educated  guess  as  to  what  the  required  probabilities  might 
be. 

Set $p_i$ to be the probability of face $i$ landing up. Jaynes defined the sought-after best
estimate of this probability to be the “blandest” probability function consistent with the
constraints of $\sum_{i=1}^{6} i\, p_i = 5$ and $\sum_{i=1}^{6} p_i = 1$. Why? Because we hardly expect it to be
otherwise; yes, the function might have an interesting peak: the die might have a 5 on each
of its faces so that $p_5 = 1$ and all the other $p_i = 0$, but this is unlikely. (We can still model
such a case as the die having 1 to 6 on its faces, with an extreme, and interesting, bias
such that only the 5 face ever lands up.)

Suppose  we  construct  lots  of  estimates  of  this  set  of  six  probabilities  by  having  a  team 
of  monkeys  repeatedly  construct  “three-dimensional  metallic”  bar  graphs  by  dropping  a 
huge number $N$ of coins into six vertical slots. Jaynes’ approach was thus to choose the
most common distribution of coins that resulted. That is, if the monkeys drop $n_i$ coins
into the $i$th slot ($i = 1, \dots, 6$), then we wish to maximise $\Omega(p_1, \dots, p_6)$, the number of ways
of obtaining the set $n_1, \dots, n_6$, where $p_i = n_i/N$.

Suppose that in general there are $M$ slots ($M = 6$ for a die). Then

$$\Omega = \frac{N!}{n_1!\, n_2! \cdots n_M!} . \tag{11.12}$$

But we’ve already seen this in (11.2), and we know that when $N \gg M$, maximising $\Omega$
is equivalent to maximising $-\sum_{i=1}^{M} p_i \ln p_i$. Jaynes made this an entry point for a new
approach to statistical mechanics, one that gave pre-eminence to the entropy $-\sum_i p_i \ln p_i$.
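
The link between the number of ways $\Omega$ and the entropy can be seen numerically. The sketch below is an illustration under stated assumptions: the slot counts in `base` are hypothetical, and the helper names are invented here. It compares $(\ln\Omega)/N$ with $-\sum_i p_i \ln p_i$ as $N$ grows, using `math.lgamma(n + 1)` for $\ln n!$.

```python
import math

def ln_omega(counts):
    """ln of N!/(n_1! ... n_M!), using lgamma(n + 1) = ln n!."""
    N = sum(counts)
    return math.lgamma(N + 1) - sum(math.lgamma(n + 1) for n in counts)

def entropy(counts):
    """-sum p_i ln p_i with p_i = n_i / N."""
    N = sum(counts)
    return -sum((n / N) * math.log(n / N) for n in counts if n > 0)

# Hypothetical coin counts in six slots, scaled up to make N >> M.
base = [2, 4, 7, 14, 25, 48]
for scale in (1, 100, 10_000):
    counts = [scale * n for n in base]
    N = sum(counts)
    print(N, ln_omega(counts) / N, entropy(counts))
```

The two printed columns approach each other as $N$ increases, which is why maximising $\Omega$ and maximising the entropy pick out the same set of $p_i$ for large $N$.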

Let's generalise the die further, so that the number on face $i$ is $E_i$. If the average
number thrown is $E$, what are all of the $p_i$? We need to maximise $-\sum_i p_i \ln p_i$ subject to

$$\sum_{i=1}^{M} p_i E_i = E , \qquad \sum_{i=1}^{M} p_i = 1 . \tag{11.13}$$

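Extremising an expression subject to constraints is usually done using Lagrange multipliers.
These multipliers are unknowns that are introduced (one for each constraint, and
called $\alpha$ and $\beta$ in the next few lines), such that the following holds for each variable $p_i$:

$$\frac{\partial}{\partial p_i} \Bigl( -\sum_j p_j \ln p_j \Bigr) = \alpha\, \frac{\partial}{\partial p_i} \sum_j p_j E_j + \beta\, \frac{\partial}{\partial p_i} \sum_j p_j . \tag{11.14}$$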

(The  method  of  Lagrange  multipliers  is  not  meant  to  be  obvious,  but  proving  why  it  works 
is  left  for  a  maths  course.)  Doing  the  partial  derivatives  gives 


$$-\ln p_i - 1 = \alpha E_i + \beta \quad \text{for all } i , \tag{11.15}$$

so that

$$p_i = \frac{e^{-\alpha E_i}}{\text{normalisation}} = \frac{e^{-\alpha E_i}}{\sum_j e^{-\alpha E_j}} , \tag{11.16}$$

where the (reciprocal of the) normalisation is $Z = \sum_j e^{-\alpha E_j}$, a useful quantity in statistical
mechanics known as the partition function, that we’ll meet again in Section 15.
For the case of the die with $E = 5$ and $M = 6$ that we began with, $E_i = i$, so (11.16) gives

$$p_1 = \frac{e^{-\alpha}}{Z} , \quad p_2 = \frac{e^{-2\alpha}}{Z} , \quad \dots , \quad p_6 = \frac{e^{-6\alpha}}{Z} . \tag{11.17}$$

To find $\alpha$, substitute the $p_i$ into (11.13) to yield a fifth-order polynomial whose roots must
be found. When this is done, we obtain

$$p_1 \approx 0.02 , \quad p_2 \approx 0.04 , \quad p_3 \approx 0.07 , \quad p_4 \approx 0.14 , \quad p_5 \approx 0.25 , \quad p_6 \approx 0.48 . \tag{11.18}$$

As  expected,  the  probabilities  are  larger  around  the  number  5. 

What  about  the  case  of  E  =  3.5?  This  is  the  mean  for  an  unbiased  die.  When  we 
apply Jaynes’ procedure as above, we do indeed find that all of the $p_i = 1/6$, as expected.
It  should  be  no  surprise  to  find  that  the  blandest  die  that  gives  E  =  3.5  is  an  unbiased 
one. 
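
The two cases above can be reproduced numerically. The sketch below is one possible approach, not the method used in the lectures: instead of finding the roots of the polynomial, it locates $\alpha$ by bisection, relying on the fact that the mean $\sum_i p_i E_i$ decreases steadily as $\alpha$ increases. The function name `maxent_die` and the bracketing interval are assumptions made for this illustration.

```python
import math

def maxent_die(mean, faces=(1, 2, 3, 4, 5, 6)):
    """Return (alpha, probabilities) with p_i = exp(-alpha E_i)/Z and the given mean."""
    def average(alpha):
        weights = [math.exp(-alpha * e) for e in faces]
        return sum(e * w for e, w in zip(faces, weights)) / sum(weights)

    lo, hi = -50.0, 50.0            # assumed bracket; average(lo) > mean > average(hi)
    for _ in range(200):            # bisection on the monotonic function average(alpha)
        mid = 0.5 * (lo + hi)
        if average(mid) > mean:
            lo = mid                # mean still too high: alpha must be larger
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    weights = [math.exp(-alpha * e) for e in faces]
    Z = sum(weights)                # the partition function
    return alpha, [w / Z for w in weights]

alpha, p = maxent_die(5.0)
print(alpha, [round(x, 2) for x in p])

_, q = maxent_die(3.5)
print([round(x, 4) for x in q])     # the unbiased die
```

For a mean of 5 the multiplier $\alpha$ comes out negative, so the weights $e^{-\alpha i}$ grow with the face number $i$.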

Jaynes’  “Brandeis  dice”  automatically  generate  the  Boltzmann  distribution,  although 
of  course  as  presented  in  this  short  section,  the  notion  of  a  temperature  still  needs  to 
be  introduced.  Nonetheless,  his  ideas  have  proved  to  be  extremely  fruitful  in  statistical 
mechanics. 


12  Distribution  of  Motions  of  Gas  Particles 

When  a  gas  is  in  contact  with  a  heat  bath,  its  particles  will  have  a  distribution  of  velocities 
governed  by  the  Boltzmann  distribution.  We  ask  two  questions: 

(a) How many particles will be found in the range of velocities from $\mathbf{v}$ to $\mathbf{v} + d\mathbf{v}$? Call
this infinitesimal number $N(\mathbf{v})\, d^3v$, where $d^3v = dv_x\, dv_y\, dv_z$. The function $N(\mathbf{v})$ is
the Maxwell velocity distribution.

(b) How many particles will be found in the range of speeds from $v$ to $v + dv$? Call this
infinitesimal number $N(v)\, dv$. The function $N(v)$ is the Maxwell speed distribution.

In  most  situations,  a  gas  can  be  considered  to  be  in  contact  with  a  heat  bath.  For  example, 
the  molecules  of  the  air  in  the  lecture  theatre  follow  a  Maxwell  distribution.  We  can 
calculate  these  distributions  as  follows. 


12.1  The  Maxwell  Velocity  Distribution 


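Suppose there are $N_{\rm tot}$ gas particles each with mass $m$, and that they are distinguishable,
so that we can consider the probability that any particular one of them will be found in
the velocity interval $\mathbf{v}$ to $\mathbf{v} + d\mathbf{v}$. This is

$$\begin{aligned}
\frac{N(\mathbf{v})\, d^3v}{N_{\rm tot}} &= (\text{prob. particle in } v_x \to v_x + dv_x) \times (\text{ditto } v_y) \times (\text{ditto } v_z) \\
&= \underbrace{(\text{prob. particle in state with } E = \tfrac12 m v_x^2 + \cdots + \tfrac12 m v_z^2 = \tfrac12 m v^2)}_{\propto\, \exp(-E/kT)} \times \underbrace{(\text{no. of states in } E \to E + dE)}_{=\, d\Omega_{\rm tot} \,=\, g(E)\, dE} .
\end{aligned} \tag{12.1}$$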
We did some density-of-states calculations in Section 3.1, but will rerun the analysis here
to show an alternative approach. In Section 3.1 we wrote $\Omega_{\rm tot}$ for the number of states
with energies $0 \to E$. We calculated $\Omega_{\rm tot}$ in (3.9), then differentiated it in (3.10) to get the
density of states $g(E)$. But as we wrote in (3.4), $d\Omega_{\rm tot} = g(E)\, dE$. So instead of calculating
$\Omega_{\rm tot}$ we might choose to focus on $d\Omega_{\rm tot}$, the number of states in the range $E \to E + dE$,
as we did in (3.5):

$$\begin{aligned}
d\Omega_{\rm tot} &= (\text{no. of “quant. mechanical cells” for } x \text{ coord.}) \times (\text{ditto } y) \times (\text{ditto } z) \\
&= \frac{[x][p_x]}{h} \cdots \frac{[z][p_z]}{h} = \frac{1}{h^3}\, \underbrace{[x] \cdots [z]}_{=\, V_{\rm space}}\; \underbrace{[p_x] \cdots [p_z]}_{=\, m\, dv_x \cdots m\, dv_z \;\propto\; d^3v} .
\end{aligned} \tag{12.2}$$


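Then (12.1) becomes

$$N(\mathbf{v})\, d^3v \propto e^{-mv^2/2kT}\, d^3v , \quad \text{or} \quad N(\mathbf{v}) = C\, e^{-mv^2/2kT} \tag{12.3}$$

with normalisation $C$. Determine $C$ by counting the particles:

$$\begin{aligned}
N_{\rm tot} = \int N(\mathbf{v})\, d^3v &= C \int_{-\infty}^{\infty}\! \int_{-\infty}^{\infty}\! \int_{-\infty}^{\infty} e^{-mv^2/2kT}\, d^3v \qquad (\text{convert to polar: } d^3v = v^2 \sin\theta\, dv\, d\theta\, d\phi) \\
&= C \int_0^{2\pi}\! d\phi \int_0^{\pi}\! d\theta\, \sin\theta \int_0^{\infty}\! dv\, v^2 e^{-mv^2/2kT} \\
&= 4\pi C \int_0^{\infty}\! dv\, v^2 e^{-mv^2/2kT} .
\end{aligned} \tag{12.4}$$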


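This integral was evaluated in Section 10.4. Alternatively, do it by differentiating

$$\int_0^{\infty} e^{-a v^2}\, dv = \frac{1}{2} \sqrt{\frac{\pi}{a}} \tag{12.5}$$

with respect to $a$ (this is called “differentiation under the integral sign”) to get

$$\int_0^{\infty} v^2 e^{-a v^2}\, dv = \frac{\sqrt{\pi}}{4}\, a^{-3/2} . \tag{12.6}$$

Setting $a = m/2kT$ gives the integral. We can then write $C$ in terms of $N_{\rm tot}$, so that (12.3)
becomes a gaussian:

$$N(\mathbf{v}) = N_{\rm tot} \left( \frac{m}{2\pi kT} \right)^{3/2} e^{-mv^2/2kT} . \tag{12.7}$$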

Is this reasonable? Imagine drawing a bar graph of the numbers of molecules in a room versus
the $x$ components of their velocities. In a first simplistic analysis, divide the molecules
into two roughly defined sets: half are moving up/down and half are moving left/right. The
half that are moving up/down all have $v_x \approx 0$, so draw a bar of height $\tfrac12 N_{\rm tot}$ at $v_x = 0$.
For the molecules moving left/right, half move left and half move right, so draw two bars
of height $\tfrac14 N_{\rm tot}$, at equal distances somewhere to the left and right of $v_x = 0$. We see a
symmetrical function that peaks at $v_x = 0$ beginning to take shape.

The width of the gaussian in (12.7) (i.e. its standard deviation) is $\sqrt{kT/m}$. As might
be  expected,  the  distribution  is  broadened  by  higher  temperatures  and  less  massive  gas 
particles. 
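
As a rough numerical check of the normalisation, the sketch below integrates $v^2 e^{-a v^2}$ with the trapezoidal rule and compares the result with the closed form $(\sqrt{\pi}/4)\, a^{-3/2}$ for $a = m/2kT$, then evaluates the width $\sqrt{kT/m}$. The choice of a nitrogen molecule at 300 K, the cutoff `v_max`, the step count, and the function name `radial_integral` are all assumptions made for this illustration.

```python
import math

k = 1.380649e-23          # Boltzmann constant, J/K
u = 1.66053907e-27        # atomic mass unit, kg

def radial_integral(a, steps=200_000):
    """Trapezoidal estimate of the integral of v^2 exp(-a v^2) from 0 to infinity."""
    v_max = 10.0 / math.sqrt(a)          # the integrand is negligible beyond this
    h = v_max / steps
    total = 0.0
    for i in range(1, steps):            # the end points contribute essentially nothing
        v = i * h
        total += v * v * math.exp(-a * v * v)
    return total * h

T = 300.0                 # assumed temperature, K
m = 28.0 * u              # assumed particle: a nitrogen molecule
a = m / (2 * k * T)

numeric = radial_integral(a)
exact = math.sqrt(math.pi) / 4 * a ** -1.5
print(numeric / exact)                           # very close to 1

# Normalisation per particle (N_tot = 1), from 1 = 4 pi C * integral.
C = 1 / (4 * math.pi * exact)
print(C, (m / (2 * math.pi * k * T)) ** 1.5)     # the two agree

print(math.sqrt(k * T / m))                      # width of the gaussian, m/s
```

For these assumed values the width is a few hundred metres per second.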
12.2  The  Maxwell  Speed  Distribution

We  are  usually  only  interested  in  the  speeds  of  the  particles,  not  their  directions  of  motion. 
What  results  is  the  Maxwell  speed  distribution. 

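Let $N(v)\, dv$ be the infinitesimal number of particles found in the range of speeds from $v$
to $v + dv$:

$$\begin{aligned}
N(v)\, dv &= \iint_{\text{all directions}} N(\mathbf{v})\, d^3v = \iint_{\text{all directions}} N_{\rm tot} \left( \frac{m}{2\pi kT} \right)^{3/2} e^{-mv^2/2kT}\, d^3v \\
&= N_{\rm tot} \left( \frac{m}{2\pi kT} \right)^{3/2} \int_0^{2\pi}\! d\phi \int_0^{\pi}\! d\theta\, \sin\theta\; v^2 e^{-mv^2/2kT}\, dv \\
&= N_{\rm tot} \sqrt{\frac{2}{\pi}} \left( \frac{m}{kT} \right)^{3/2} v^2 e^{-mv^2/2kT}\, dv .
\end{aligned} \tag{12.8}$$

Thus we arrive at

$$N(v) = N_{\rm tot} \sqrt{\frac{2}{\pi}} \left( \frac{m}{kT} \right)^{3/2} v^2 e^{-mv^2/2kT} . \tag{12.9}$$

(Remember that speed is always non-negative, so $v \ge 0$.) Compare this with the velocity
distribution (12.7): aside from the different normalisation, the speed distribution has an
extra factor of $v^2$, which pushes its peak out to some value of speed greater than zero.
We’ll determine this value soon.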


Alternative  Derivation  of  the  Speed  Distribution 

The  speed  distribution  is  sometimes  calculated  slightly  differently  to  what  we  have  done. 
We  derived  it  by  summing  over  all  directions  using  the  velocity  distribution.  But  we  could 
have  avoided  reference  to  the  velocity  distribution  as  follows.  We  have 

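$$\begin{aligned}
\frac{N(v)\, dv}{N_{\rm tot}} &= \text{prob. for particle to have speed in } v \to v + dv \\
&= \underbrace{(\text{prob. particle in energy state } \tfrac12 m v^2)}_{\propto\, \exp(-mv^2/2kT)} \times \underbrace{(\text{no. of states in } v \to v + dv)}_{=\, d\Omega_{\rm tot} \,=\, g(v)\, dv \,=\, g(E)\, dE} .
\end{aligned} \tag{12.10}$$

Now realise that $d\Omega_{\rm tot}$ is the number of states in a shell of infinitesimal thickness in
momentum space at energy $E = \tfrac12 m v^2$:

$$\begin{aligned}
d\Omega_{\rm tot} &= (\text{no. of “quant. mechanical cells” for } x \text{ coord.}) \times (\text{ditto } y) \times (\text{ditto } z) \\
&= \frac{[x][p_x]}{h} \cdots \frac{[z][p_z]}{h} = \frac{1}{h^3}\, \underbrace{[x] \cdots [z]}_{=\, V_{\rm space}}\; \underbrace{[p_x] \cdots [p_z]}_{\text{volume of shell in 3-D momentum space}} \\
&\propto \text{volume of shell of radius } v \to v + dv \\
&\propto v^2\, dv .
\end{aligned} \tag{12.11}$$

Putting this into (12.10) gives

$$N(v)\, dv \propto v^2 e^{-mv^2/2kT}\, dv , \tag{12.12}$$

which can be normalised as before to arrive at (12.9) again.

On a side note, with $E = \tfrac12 m v^2$ implying $dE = mv\, dv$, we can also write

$$v^2\, dv = \frac{v^2\, dE}{mv} = \frac{v\, dE}{m} \propto \sqrt{E}\, dE , \tag{12.13}$$

as was found in Section 3.1. This gives a spread over energies of

$$N(E)\, dE = N(v)\, dv \propto \sqrt{E}\, e^{-E/kT}\, dE . \tag{12.14}$$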


12.3  Representative  Speeds  of  Gas  Particles 

There  are  different  ways  of  producing  a  representative  speed  of  the  particles.  Four  standard 
ones  are  derived  from  the  Maxwell  speed  distribution  N(v)  in  (12.9).  They  are  not  all 
equally  important,  but  are  different  ways  of  approaching  the  idea  of  a  representative  value. 
Thus  it’s  useful  to  examine  each  briefly.  In  order  of  increasing  size,  they  are 


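Most likely speed $\hat{v}$. This is found by solving $N'(v) = 0$. The straightforward differentiation
leads to

$$\hat{v} = \sqrt{\frac{2kT}{m}} = \sqrt{\frac{2RT}{M_{\rm mol}}} \approx 1.4 \sqrt{\frac{RT}{M_{\rm mol}}} , \tag{12.15}$$

where, as usual, $R$ is the gas constant and $M_{\rm mol}$ is the gas particles’ molar mass.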


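Median speed $v_m$. This is the speed at which half the particles are travelling slower,
and half faster. Obtain it by solving

$$\int_0^{v_m} N(v)\, dv = \frac{N_{\rm tot}}{2} . \tag{12.16}$$

Do this with a change of variables, $x = \sqrt{m/2kT}\; v$, along with

$$\int x^2 e^{-x^2}\, dx = \frac{\sqrt{\pi}}{4}\, \operatorname{erf} x - \frac{x}{2}\, e^{-x^2} , \tag{12.17}$$

to arrive at (with $x_m = \sqrt{m/2kT}\; v_m$)

$$\operatorname{erf} x_m - \frac{2}{\sqrt{\pi}}\, x_m e^{-x_m^2} = \frac{1}{2} . \tag{12.18}$$

This is solved numerically, resulting in a median speed of

$$v_m \approx \sqrt{\frac{2.366\, kT}{m}} \approx 1.5 \sqrt{\frac{RT}{M_{\rm mol}}} . \tag{12.19}$$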


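Mean speed $\bar{v}$ or $\langle v \rangle$.

$$\bar{v} = \int_0^{\infty} v\, \mathrm{prob}(v) = \int_0^{\infty} v\, \frac{N(v)\, dv}{N_{\rm tot}} = \sqrt{\frac{2}{\pi}} \left( \frac{m}{kT} \right)^{3/2} \int_0^{\infty} v^3 e^{-mv^2/2kT}\, dv . \tag{12.20}$$

This integral can be done by parts, writing it as $\int_0^{\infty} v^2 \cdot v\, e^{-mv^2/2kT}\, dv$. We get

$$\bar{v} = \sqrt{\frac{8kT}{\pi m}} \approx 1.6 \sqrt{\frac{RT}{M_{\rm mol}}} . \tag{12.21}$$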




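RMS speed $v_{\rm rms}$. This is the “(square) root (of the) mean (of the) square (of the)
speed”, so

$$v_{\rm rms}^2 = \langle v^2 \rangle = \int_0^{\infty} v^2\, \frac{N(v)\, dv}{N_{\rm tot}} = \sqrt{\frac{2}{\pi}} \left( \frac{m}{kT} \right)^{3/2} \int_0^{\infty} v^4 e^{-mv^2/2kT}\, dv . \tag{12.22}$$

Do this integral in the same way as (12.5) and (12.6), but now differentiate twice under the
integral sign. The final result is (remembering to take the square root)

$$v_{\rm rms} = \sqrt{\frac{3kT}{m}} \approx 1.7 \sqrt{\frac{RT}{M_{\rm mol}}} . \tag{12.23}$$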


This value makes good sense, since it implies that the mean value of a particle’s energy is

$$\langle E \rangle = \langle \tfrac12 m v^2 \rangle = \frac{m}{2} \langle v^2 \rangle = \frac{m v_{\rm rms}^2}{2} = \frac{m}{2}\, \frac{3kT}{m} = \tfrac32 kT , \tag{12.24}$$

just as we expect from the calculation of Section 10.4: a particle’s average thermal energy
per degree of freedom for a system contacting a bath is $\tfrac12 kT$, and we are only concerned
with translational kinetic energy here, so there are 3 degrees of freedom. This simple
connection with the average energy makes the rms speed perhaps the most widely used
representative speed of the particles.
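
The four representative speeds can be compared numerically. In the sketch below each speed is written as a pure number times $\sqrt{kT/m} = \sqrt{RT/M_{\rm mol}}$. The most likely speed is taken as $\sqrt{2kT/m}$, the standard result of setting $N'(v) = 0$, and the median is found by bisection on the condition $\operatorname{erf} x - (2/\sqrt{\pi})\, x e^{-x^2} = 1/2$ with $v_m = x_m \sqrt{2kT/m}$. The temperature of 300 K, the molar mass of 28.8 g for air, and the function name `median_x` are assumptions made for this illustration.

```python
import math

def median_x():
    """Solve erf(x) - (2/sqrt(pi)) x exp(-x^2) = 1/2 by bisection."""
    def f(x):
        return math.erf(x) - 2 / math.sqrt(math.pi) * x * math.exp(-x * x) - 0.5
    lo, hi = 0.0, 5.0                 # f(lo) < 0 < f(hi), and f increases with x
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Each speed is a pure number times sqrt(kT/m) = sqrt(RT/M_mol).
coefficients = {
    "most likely": math.sqrt(2),
    "median": math.sqrt(2) * median_x(),     # v_m = x_m sqrt(2kT/m)
    "mean": math.sqrt(8 / math.pi),
    "rms": math.sqrt(3),
}

R = 8.314          # gas constant, J/(mol K)
T = 300.0          # assumed temperature, K
M_mol = 0.0288     # assumed molar mass of air, kg/mol
scale = math.sqrt(R * T / M_mol)

for name, c in coefficients.items():
    print(f"{name:12s} {c:.3f} x sqrt(RT/M_mol) = {c * scale:.0f} m/s")
```

The printed coefficients come out in increasing order, and the square of the median coefficient reproduces the factor 2.366.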


13  Theory  of  Transport  Processes 

Here  we  study  several  processes  based  on  the  concept  of  the  mean  free  path,  that  combine 
to  allow  experiments  to  be  done  that  test  the  validity  of  the  models  we  have  been  using. 

13.1  Mean  Free  Path  of  Gas  Particles 

Consider  a  gas  with  all  particles  alike  but  distinguishable,  for  which  the  Maxwell  velocity 
distribution will apply. We wish to calculate $\lambda$, the particles’ mean free path, or the mean
distance  between  successive  collisions.  The  following  is  a  rough  argument  for  how  to  do 
this.  It  can  be  made  more  rigorous  by  examining  more  closely  the  relevant  interactions, 
but  for  our  purpose  it’s  fine.  We  assume  the  gas  isn’t  too  dense,  so  that  each  particle 
spends  most  of  its  time  in  free  flight.  This  is  a  very  good  approximation  for  all  manner 
of  gases — even  pseudo  gases  such  as  conduction  electrons  moving  about  in  a  metal,  that 
we’ll  consider  later. 

Suppose a particle travels at mean speed $\bar{v}$ for a time $\Delta t$. It “carves out” a tube of
length $\bar{v}\, \Delta t$. If the particles it encountered were all at rest in the laboratory (and so moving
with relative speed $\bar{v}$ past our particle), the number of collisions would equal the number of
particles whose centres are in this tube. For a particle density $n$, this number of particles
(collisions) equals

$$n \times \text{volume of tube} = n \sigma \bar{v}\, \Delta t , \tag{13.1}$$

where $\sigma$ is the collision cross section. We can imagine that any particle at a distance
farther than $2r$ (where $r$ is the radius of each particle) will not be struck, so $\sigma = \pi (2r)^2$.
In this case then, the mean free path would be approximately the tube length divided by
the number of collisions, or

$$\lambda \approx \frac{\bar{v}\, \Delta t}{n \sigma \bar{v}\, \Delta t} = \frac{1}{n \sigma} . \tag{13.2}$$
In practice the particles are not at rest, so they don’t come past our particle with a relative
speed $\bar{v}$. Suppose instead that they all pass our particle with a relative speed $\bar{u}$, and we
will disregard the finer points of averaging over the various directions from which they
came. Then the number of collisions, or particles encountered, by our particle is as if our
particle were travelling at $\bar{u}$ for a time $\Delta t$ through stationary particles. In that case (and
remembering that the tube length is still $\bar{v}\, \Delta t$),

$$\text{number of collisions} = n \sigma \bar{u}\, \Delta t , \quad \text{so} \quad \lambda \approx \frac{\bar{v}\, \Delta t}{n \sigma \bar{u}\, \Delta t} = \frac{\bar{v}}{n \sigma \bar{u}} . \tag{13.3}$$


What is $\bar{u}$? Although we won’t do so here, we can use the Maxwell velocity distribution to
write the combined probability that some particle (“particle 1”) has velocity $\mathbf{v}_1$ and another
particle (“2”) has velocity $\mathbf{v}_2$. We can then write each velocity in terms of the particles’
relative velocity $\mathbf{u} = \mathbf{v}_1 - \mathbf{v}_2$ and their centre-of-mass velocity, $\mathbf{V} = \tfrac12 (\mathbf{v}_1 + \mathbf{v}_2)$. We then
integrate over $\mathbf{V}$ to get the distribution of $\mathbf{u}$, and find the mean of the relative speed $u$ in
the same way as was done in Section 12.3. There is a change of 6 variables $v_{1x}, \dots, v_{2z}$ to
$V_x, \dots, u_z$ needed (requiring the determinant of a $6 \times 6$ matrix, but that turns out to be
straightforward), and some more integration. We’ll omit the details and simply write the
answer:

$$\bar{u} = \bar{v} \sqrt{2} . \tag{13.4}$$

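Hence (13.3) becomes

$$\lambda \approx \frac{1}{n \sigma \sqrt{2}} . \tag{13.5}$$

We can also write down the collision frequency:

$$\text{collision frequency} = \frac{\text{no. of collisions}}{\Delta t} = \frac{n \sigma \bar{u}\, \Delta t}{\Delta t} = n \sigma \bar{v} \sqrt{2} . \tag{13.6}$$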
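
The factor of $\sqrt{2}$ relating the mean relative speed to the mean speed can be checked without doing the six-dimensional integral. The sketch below is a Monte Carlo illustration, not the analytic route described above: it draws pairs of velocities whose components are independent gaussians (in units of $\sqrt{kT/m}$) and averages the speeds and relative speeds. The sample size, the random seed, and the helper names are assumptions.

```python
import math
import random

rng = random.Random(1)    # fixed seed for a repeatable run

def random_velocity():
    """One velocity with independent gaussian components, in units of sqrt(kT/m)."""
    return (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

samples = 100_000
speed_sum = 0.0
relative_sum = 0.0
for _ in range(samples):
    v1 = random_velocity()
    v2 = random_velocity()
    speed_sum += norm(v1)
    relative_sum += norm(tuple(a - b for a, b in zip(v1, v2)))

mean_speed = speed_sum / samples
mean_relative = relative_sum / samples
print(mean_speed, math.sqrt(8 / math.pi))          # sampled and exact mean speed
print(mean_relative / mean_speed, math.sqrt(2))    # sampled ratio and sqrt(2)
```

The sampled ratio agrees with $\sqrt{2}$ to within the statistical scatter expected for this number of samples.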


Example:  What  are  the  mean  free  path  and  collision  frequency  for  particles  in  the 
air  of  our  lecture  theatre? 

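Give these particles a temperature of 300 K and a pressure of $10^5$ Pa. Use the ideal gas
law $PV = NkT$ to write the particle density as

$$n = \frac{\text{no. of molecules in } V}{V} = \frac{N}{V} = \frac{P}{kT} \tag{13.7}$$

and

$$\sigma = 4\pi r^2 , \quad \text{where } r \approx 10^{-10}\ \text{m} . \tag{13.8}$$

So the mean free path is

$$\lambda \approx \frac{1}{n \sigma \sqrt{2}} = \frac{kT}{P\, 4\pi r^2 \sqrt{2}} = \frac{1.38 \times 10^{-23} \times 300}{10^5 \times 4\pi \times 10^{-20} \times \sqrt{2}}\ \text{m} \approx 0.23\ \mu\text{m} . \tag{13.9}$$

The collision frequency is

$$n \sigma \bar{v} \sqrt{2} = \frac{P}{kT}\, 4\pi r^2 \sqrt{\frac{8kT}{\pi m}}\, \sqrt{2} = \frac{P}{kT}\, 4\pi r^2 \sqrt{\frac{16RT}{\pi M_{\rm mol}}} . \tag{13.10}$$

For air, $M_{\rm mol} \approx 0.8 \times 28 + 0.2 \times 32 = 28.8$ g, so the collision frequency is

$$\frac{10^5}{1.38 \times 10^{-23} \times 300} \times 4\pi \times 10^{-20} \times \sqrt{\frac{16 \times 8.314 \times 300}{\pi \times 0.0288}}\ \text{s}^{-1} \approx 2.0 \times 10^9\ \text{s}^{-1} . \tag{13.11}$$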


Two  thousand  million  collisions  per  second  is  an  extraordinary  number,  but  there  are,  of 
course,  many  air  molecules  in  the  room. 
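
The arithmetic of this example can be reproduced with a short script. The sketch below uses the values of the example (300 K, a pressure of $10^5$ Pa, a particle radius of $10^{-10}$ m, and a molar mass of 28.8 g); the variable names are chosen for this illustration.

```python
import math

k = 1.380649e-23     # Boltzmann constant, J/K
R = 8.314            # gas constant, J/(mol K)

T = 300.0            # temperature, K
P = 1e5              # pressure, Pa
r = 1e-10            # particle radius, m
M_mol = 0.0288       # molar mass of air, kg/mol

n = P / (k * T)                                  # particle density from PV = NkT
sigma = 4 * math.pi * r ** 2                     # collision cross section, pi (2r)^2
mean_speed = math.sqrt(8 * R * T / (math.pi * M_mol))

mean_free_path = 1 / (n * sigma * math.sqrt(2))
collision_frequency = n * sigma * mean_speed * math.sqrt(2)

print(f"n      = {n:.3g} m^-3")
print(f"lambda = {mean_free_path * 1e6:.2f} micrometres")
print(f"rate   = {collision_frequency:.2g} per second")
```

Changing `P` or `T` shows how the mean free path lengthens at low pressure and how the collision frequency responds.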


13.2  Viscosity 

Our  discussion  of  mean  free  path  in  a  gas  can  be  applied  to  connect  ideas  of  viscosity, 
thermal  conductivity,  and  heat  capacity  for  a  gas.  More  generally  it  can  also  be  applied 
to  a  fluid  (being  a  substance  that  takes  the  shape  of  its  container),  provided  the  fluid’s 
internal  processes  are  not  so  complex  as  to  negate  the  assumptions  that  we’ll  make. 

Consider  a  fluid  with,  say,  a  metal  plate  on  top  in  the  xy  plane  that  is  pulled  with 
some  constant  velocity.  The  plate  will  drag  fluid  particles  along  with  it.  For  gases,  a  good 
model  of  the  situation  is  that  the  gas  is  composed  of  a  stack  of  plates  in  the  xy  plane, 
where  the  plate  at  z  =  constant  experiences  a  force  that  drags  it  against  internal  friction 
along the $x$ direction with velocity $u_x(z)$. As we pull on the top plate, excess $x$ momentum
is transferred by random particle motion down through the plates, which drags them in
turn, although the lower the plates are, the lesser the $x$ velocity they’ll inherit. (We disregard
the ever-present mean $x$ momentum, which must cancel over large areas, otherwise
the plates would move spontaneously. The only $x$ momentum used below is the excess
responsible for the plates moving.)

We wish to relate the force required to drag the top plate to some measure of the gas’s
viscosity. First, note that

$$\text{force applied} = \frac{\text{momentum transferred}}{\text{time taken}} . \tag{13.12}$$

So as the plate at $z$ = constant is dragged in the $x$ direction, we can define a quantity
sometimes called $T_{xz}$ by

$$\begin{aligned}
T_{xz} &\equiv \left( \text{$x$ component of force needed to drag the unit-area plate of gas at $z$ with constant velocity} \right) \\
&\overset{(13.12)}{=} \frac{p_x \text{ transferred to plate}}{\text{area} \times \text{time taken}} \\
&= \text{net } p_x \text{ transferred up through plate at } z \text{, per unit area per unit time} \\
&= (p_x \text{ per particle}) \times \left( \text{no. of particles transferred up through plate at $z$, per unit area per unit time} \right) .
\end{aligned} \tag{13.13}$$

With $n$ particles per unit volume, consider that $n/3$ have some motion in the $z$ direction,
with half of these going up and half going down. So the $n/6$ going down carry a higher
$x$ momentum down (which accelerates those bottom layers), and the $n/6$ going up carry a
lower $x$ momentum up. Thus $x$ momentum leaks down through the plates, pulling them
to the right with decreasing force the lower it goes.

The number of particles passing, say, up through the $z$ plate per unit area per unit
time is their flux density in the $z$ direction. (Flux density is very often just called flux, but
this conflicts with its most well-known use, in electromagnetism. So we won’t call it flux.)
If we consider $N$ particles per unit volume passing with speed $v$ for a time $\Delta t$ along a tube
of cross-sectional area $A$, then the flux density through the tube’s end face is governed by
how many particles passed through the face in this time:

$$\text{flux density} = \frac{\text{no. through face}}{\text{area} \times \text{time}} = \frac{N A v\, \Delta t}{A\, \Delta t} = N v . \tag{13.14}$$


As  used  here,  “flux”  is  synonymous  with  current.  Flux  can  refer  to  the  motion  of  anything,  such  as 
particles,  mass,  electric  charge,  as  well  as  being  applied  to  field  lines  in  electromagnetism.  More 
generally,  flux  density  is  a  vector  that  can  be  dotted  with  the  normal  to  an  area  to  tell  us  the  flux 
through  that  area.  For  the  flow  of  a  substance,  the  expression 


flux  density  =  substance  density  x  substance  velocity 


(13.15) 


is  well  worth  remembering.  Note  that  the  first  word  “density”  in  (13.15)  refers  to  the  unit  spatial 
area,  not  the  unit  time;  that  is,  flux  density  is  flux  per  unit  area,  where  flux  means  how  much 
substance  flows  per  unit  time.  So  flux  equals  flux  density  times  an  area  (not  times  a  time).  The 
second  word  “density”  in  (13.15)  refers  to  the  amount  of  substance  per  unit  volume. 


In our case, $N = n/6$ in each direction across a plate, and the flux density in each
direction is then $n\bar{v}/6$. So (13.13) says that the average “up” contribution to $T_{xz}$ is
$n\bar{v}/6 \times (p_x \text{ per particle})$. The $p_x$ carried by each particle is what that particle inherited
at its last collision, which happened a distance of approximately $\lambda$ from the plate at $z$.
For “up” motion (say, the direction of increasing $z$), this momentum is formed from the
particle’s velocity at $z - \lambda$, so is approximately $m u_x(z - \lambda)$. Similarly, the momentum
carried down is $m u_x(z + \lambda)$. The net momentum travelling up through the plate is then

$$\begin{aligned}
T_{xz} &= \underbrace{\frac{n\bar{v}}{6}\, m u_x(z - \lambda)}_{\text{“up” part}} - \underbrace{\frac{n\bar{v}}{6}\, m u_x(z + \lambda)}_{\text{“down” part}} \\
&\approx \frac{n\bar{v}m}{6} \bigl[ u_x(z) - u_x'(z)\lambda - u_x(z) - u_x'(z)\lambda \bigr] \qquad (\text{a Taylor expansion}) \\
&= \frac{-n\bar{v}m\lambda}{3}\, \frac{\partial u_x}{\partial z} \equiv -\eta\, \frac{\partial u_x}{\partial z} ,
\end{aligned} \tag{13.16}$$

where we have written a partial derivative to show that more generally $u_x$ depends on $y$
as well, and where $\eta = n\bar{v}m\lambda/3$ is the coefficient of viscosity. Since our top metal plate is
only moving in the $x$ direction, $T_{xz}$ is the whole force per unit plate area needed to drag
it, and so $\eta$ can be measured.

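Writing $\eta$ in terms of what we know already,

$$\eta = \frac{n}{3}\, \bar{v} m \lambda = \frac{n}{3} \sqrt{\frac{8RT}{\pi M_{\rm mol}}}\; \frac{M_{\rm mol}}{N_A}\; \frac{1}{n \sigma \sqrt{2}} = \frac{\sqrt{4RT M_{\rm mol}}}{3 \sqrt{\pi}\, N_A\, \sigma} . \tag{13.17}$$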


Note that the gas’s particle density $n$ cancels; the surprising result is that viscosity is
independent of this density (at a given temperature). This was both derived and experimentally
confirmed by Maxwell.

A gas of close-packed particles each with radius $r$ has a mass density in the region of

$$\frac{m}{r^3} = \frac{M_{\rm mol}}{N_A\, r^3} . \tag{13.18}$$

Loschmidt used equations (13.17) and (13.18) in 1865 to obtain the first values of $r$ and $N_A$.

Equation (13.17) predicts, correctly, that the viscosity of a gas increases with temperature.
The viscosity of liquids actually decreases with temperature. For liquids, we must
add  something  to  the  model:  the  particles  are  so  close  together  that  particles  on  adjacent 
planes  will  interact  with  each  other,  even  though  they  are  not  crossing  the  planes.  But 
that’s  outside  this  course. 
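
The cancellation of the particle density can be confirmed numerically. The sketch below builds $\eta = n\bar{v}m\lambda/3$ from its separate factors for several densities and compares it with the closed form $\sqrt{4RT M_{\rm mol}/\pi}\,/\,(3 N_A \sigma)$. The molar mass of 28.8 g and radius of $10^{-10}$ m are the illustrative values for air used earlier; the function names are invented for this illustration, and the numbers should be read as order-of-magnitude estimates from a heuristic model.

```python
import math

R = 8.314            # gas constant, J/(mol K)
N_A = 6.022e23       # Avogadro's number, 1/mol
k = R / N_A          # Boltzmann constant, J/K

def viscosity(n, T, M_mol, r):
    """eta = n * mean speed * m * mean free path / 3, built from its separate factors."""
    m = M_mol / N_A
    sigma = 4 * math.pi * r ** 2
    mean_speed = math.sqrt(8 * k * T / (math.pi * m))
    mean_free_path = 1 / (n * sigma * math.sqrt(2))
    return n * mean_speed * m * mean_free_path / 3

def viscosity_closed_form(T, M_mol, r):
    """The same quantity after the density n has cancelled."""
    sigma = 4 * math.pi * r ** 2
    return math.sqrt(4 * R * T * M_mol / math.pi) / (3 * N_A * sigma)

M_mol, r = 0.0288, 1e-10          # illustrative values for air
for n in (1e23, 1e25, 1e27):      # very different particle densities
    print(n, viscosity(n, 300.0, M_mol, r))
print(viscosity_closed_form(300.0, M_mol, r))
print(viscosity_closed_form(600.0, M_mol, r))   # larger: eta grows as sqrt(T)
```

Every density gives the same viscosity, and doubling the temperature raises it by a factor of $\sqrt{2}$.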


13.3  Thermal  Conductivity 


In Section 7 we looked at thermal conduction from a bulk point of view. Now we investigate
the coefficient of thermal conductivity $\kappa$ with an atomic approach almost identical to the
above for viscosity. In particular, consider a plate in the $xy$ plane at some temperature $T$.
The flux density of thermal energy across a plate at $z$ = constant is the $z$ component
of (7.1), or

$$J_z = -\kappa\, \frac{\partial T}{\partial z} . \tag{13.19}$$

In fact $J_z$ has another name: $T_{tz}$, where $T_{tz}$ and $T_{xz}$ are 2 of 10 components of the stress-energy
tensor (which is not part of our course). Suffice it to say that if we arrange for the speed of light
to be dimensionless, then the components of the stress-energy tensor all have units of pressure.
Einstein postulated that stress-energy curves spacetime; that is, stress-energy is to Einstein what
mass is to Newton: the source of gravity.


Equation (13.19) holds well for liquids and gases, as well as solids with sufficient homogeneity.
It’s analogous to (13.16); for viscosity we looked at the transfer of $x$ momentum,
but now we look at the transfer of thermal energy $E$. Particles at height $z$ each have
thermal energy $E(z)$. We have

$$\begin{aligned}
J_z &= \text{net heat transferred up through plate at } z \text{, per unit area per unit time} \\
&= (\text{heat energy per particle}) \times \left( \text{no. of particles transferred up through plate at $z$, per unit area per unit time} \right) \\
&= \underbrace{\frac{n\bar{v}}{6}\, E(z - \lambda)}_{\text{“up”}} - \underbrace{\frac{n\bar{v}}{6}\, E(z + \lambda)}_{\text{“down”}} \\
&\approx \frac{n\bar{v}}{6} \bigl[ E(z) - E'(z)\lambda - E(z) - E'(z)\lambda \bigr] \\
&= \frac{-n\bar{v}\lambda}{3}\, \frac{\partial E}{\partial z} ,
\end{aligned} \tag{13.20}$$


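where again we have used a partial derivative for generality. But if we just consider
heat conduction in the $z$ direction, we can use an ordinary derivative, and equate (13.19)
with (13.20):

$$-\kappa\, \frac{dT}{dz} = \frac{-n\bar{v}\lambda}{3}\, \frac{dE}{dz} . \tag{13.21}$$

But that means

$$\kappa = \frac{n\bar{v}\lambda}{3}\, \frac{dE}{dT} = \frac{n\bar{v}\lambda}{3}\, C_V^{\rm 1\ particle} = \frac{n\bar{v}\lambda}{3}\, \frac{C_V^{\rm mol}}{N_A} . \tag{13.22}$$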
Now use this last equation along with $\eta = n\bar{v}m\lambda/3$ to write

$$\frac{\kappa}{\eta} = \frac{C_V^{\rm mol}}{M_{\rm mol}} = C_V^{\rm sp} . \tag{13.23}$$


So we have related viscosity, thermal conductivity, and heat capacity using an atomic view
of matter. Experiments yield values of

$$
\frac{\kappa}{\eta} \approx (1.5 \to 2.5) \times C_V^{\rm sp}\,.
\tag{13.24}
$$

Their approximate agreement with theory forms a good justification for the validity of the
kinetic/atomic models that we have been using. We can easily expect to be out by a factor
of 2 in our calculations, as these are based on heuristic models with a heavy reliance on
averaging. But we haven’t done too badly.
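The cancellation behind (13.23) can be checked numerically. The following is a minimal sketch, not part of the original lectures: the function names and the argon-like input values (number density, speed, mean free path) are illustrative assumptions, and the heat capacity is taken to be that of a monatomic ideal gas.

```python
# Kinetic-theory estimates: eta = n v m lambda / 3 and kappa from (13.22).
N_A = 6.02214076e23   # Avogadro's number, 1/mol
R_GAS = 8.314462618   # molar gas constant, J/(mol K)


def viscosity(n, v, m, mfp):
    """eta = n v m lambda / 3, with mfp the mean free path lambda."""
    return n * v * m * mfp / 3


def thermal_conductivity(n, v, mfp, cv_molar):
    """kappa = (n v lambda / 3) * C_V^mol / N_A, as in (13.22)."""
    return n * v * mfp / 3 * cv_molar / N_A


# Assumed illustrative inputs for a monatomic gas resembling argon.
molar_mass = 0.039948             # kg/mol
m = molar_mass / N_A              # mass of one particle, kg
cv_molar = 1.5 * R_GAS            # J/(mol K), monatomic ideal gas
n, v, mfp = 2.5e25, 400.0, 7e-8   # rough number density, speed, mean free path

ratio = thermal_conductivity(n, v, mfp, cv_molar) / viscosity(n, v, m, mfp)
cv_specific = cv_molar / molar_mass   # J/(kg K)
print(ratio, cv_specific)             # the two numbers agree
```

The inputs n, v and λ drop out of the ratio, which is why (13.23) relates κ/η to the heat capacity alone.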


14  Bands,  Levels,  and  States 


Discrete values of energy that a system can have are called energy levels, and each one may
contain some degeneracy; that is, there might be several states all with the same energy.
When energy levels are very close together so as to form more or less a continuum, the
language used to describe them can be a little confusing. A set of such energy levels
is called an energy band, and the levels and states comprising it are treated together and
collectively just called energy states.

As an example, consider two energy levels. Suppose that for a system in contact
with a bath at 17°C (kT = 0.025 eV), level 1 has energy $E_1 = 1$ eV and degeneracy $g_1 = n$
states. Level 2 has $E_2 = 1.02$ eV and $g_2 = n$ states also. What is the ratio of the numbers
of particles in these two levels? If there are $N_{\rm tot}$ particles present, then the number of
particles in level i is

$$
\begin{aligned}
N_i &= N_{\rm tot} \times \text{probability for each particle to have energy } E_i \\
&= N_{\rm tot} \times \sum_{\text{states}} \text{prob. for a particle to be in a state at level } E_i \\
&= N_{\rm tot}\, g_i\, C \exp\frac{-E_i}{kT}\,,
\end{aligned}
\tag{14.1}
$$

where C is a normalisation. Then

$$
\frac{N_2}{N_1} = \frac{N_{\rm tot}\, n\, C \exp\frac{-1.02}{0.025}}{N_{\rm tot}\, n\, C \exp\frac{-1}{0.025}}
= \exp\frac{-0.02}{0.025} \simeq 0.45\,.
\tag{14.2}
$$


Suppose now that the n states in level 2 are spread out, forming an energy band from
1.02 → 1.03 eV. (These energy states have now become energy levels!) Now what is the
ratio of numbers of particles in band 2 to level 1 (a.k.a. band 1)? Let’s first approximate
band 2 as two energy levels (at 1.02 and 1.03 eV), each with degeneracy n/2 states. Thus

$$
g_1 = n\,, \qquad g_{1.02} = g_{1.03} = n/2\,.
\tag{14.3}
$$
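Then

$$
\frac{N_2}{N_1}
= \frac{N_{\rm tot}\,\frac{n}{2}\, C \exp\frac{-1.02}{0.025} + N_{\rm tot}\,\frac{n}{2}\, C \exp\frac{-1.03}{0.025}}{N_{\rm tot}\, n\, C \exp\frac{-1}{0.025}}
= \frac{1}{2}\exp\frac{-0.02}{0.025} + \frac{1}{2}\exp\frac{-0.03}{0.025} \simeq 0.375\,.
\tag{14.4}
$$

Now let’s do better by approximating band 2 as three energy levels (at 1.02, 1.025,
1.03 eV), each with n/3 states:

$$
g_1 = n\,, \qquad g_{1.02} = g_{1.025} = g_{1.03} = n/3\,.
\tag{14.5}
$$

Then

$$
\begin{aligned}
\frac{N_2}{N_1}
&= \frac{N_{\rm tot}\,\frac{n}{3}\, C \exp\frac{-1.02}{0.025} + N_{\rm tot}\,\frac{n}{3}\, C \exp\frac{-1.025}{0.025} + N_{\rm tot}\,\frac{n}{3}\, C \exp\frac{-1.03}{0.025}}{N_{\rm tot}\, n\, C \exp\frac{-1}{0.025}} \\
&= \frac{1}{3}\exp\frac{-0.02}{0.025} + \frac{1}{3}\exp\frac{-0.025}{0.025} + \frac{1}{3}\exp\frac{-0.03}{0.025} \simeq 0.373\,.
\end{aligned}
\tag{14.6}
$$

Suppose we could spread the n energy levels out evenly over the band. Then

$$
\begin{aligned}
N_2 &= N_{\rm tot} \times \text{prob. for a particle to be in band 2} \\
&= N_{\rm tot} \sum_{\text{levels}} \text{prob. for a particle to be in a level } E \\
&= N_{\rm tot} \sum_{\text{levels}} (\text{prob. for a particle to be in a state around level } E) \times (\text{number of states around level } E)\,.
\end{aligned}
\tag{14.7}
$$

Imagine that in fact we spread the n levels out over the band’s 0.01 eV width to form a
continuum. Then the number of states (or levels!) around E is

$$
g(E)\, dE = \frac{n}{0.01\ \text{eV}}\, dE\,.
\tag{14.8}
$$

For such a case, (14.7) becomes

$$
\begin{aligned}
N_2 &= N_{\rm tot} \int C e^{-E/kT}\, g(E)\, dE \\
&= N_{\rm tot}\, \frac{n}{0.01\ \text{eV}}\, C \int_{1.02\ \text{eV}}^{1.03\ \text{eV}} e^{-E/kT}\, dE \\
&= N_{\rm tot}\, \frac{n}{0.01\ \text{eV}}\, C\, (-0.025\ \text{eV}) \left[\exp\frac{-1.03}{0.025} - \exp\frac{-1.02}{0.025}\right],
\end{aligned}
\tag{14.9}
$$

and with $N_1$ unchanged from (14.2), we get

$$
\frac{N_2}{N_1} = \frac{-0.025}{0.01}\left[\exp\frac{-0.03}{0.025} - \exp\frac{-0.02}{0.025}\right] \simeq 0.370\,.
\tag{14.10}
$$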
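The convergence of the band-to-level ratio toward its continuum value can be reproduced with a few lines of code. This is a minimal sketch using the numbers of the example above; the function names are illustrative assumptions, and the degeneracy n cancels in every ratio so it does not appear.

```python
import math

KT = 0.025   # eV, bath at about 17 degrees C as in the example
E1 = 1.0     # eV, energy of level 1


def band_to_level_ratio(num_levels, e_low=1.02, e_high=1.03):
    """N2/N1 when band 2 is modelled as num_levels equally spaced levels
    (end points included) that share the n states equally."""
    if num_levels == 1:
        energies = [e_low]
    else:
        step = (e_high - e_low) / (num_levels - 1)
        energies = [e_low + i * step for i in range(num_levels)]
    return sum(math.exp(-(e - E1) / KT) for e in energies) / num_levels


def continuum_ratio(e_low=1.02, e_high=1.03):
    """N2/N1 when the n states are spread uniformly over the band."""
    width = e_high - e_low
    return -(KT / width) * (
        math.exp(-(e_high - E1) / KT) - math.exp(-(e_low - E1) / KT)
    )


for m in (1, 2, 3, 10, 1000):
    print(m, round(band_to_level_ratio(m), 4))
print("continuum", round(continuum_ratio(), 4))
```

With one, two and three levels the function returns the values quoted in (14.2), (14.4) and (14.6), and for many levels it approaches the continuum result.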
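More generally, (14.7) can be written as

$$
\begin{array}{c}\text{number of particles}\\ \text{in } E_0 \to E_0 + \Delta E\end{array}
= N_{\rm tot} \int_{E_0}^{E_0+\Delta E}
\underbrace{C e^{-E/kT}}_{\substack{\text{prob. to be in}\\ \text{state around } E}}
\times
\underbrace{g(E)\, dE}_{\substack{\text{number of states}\\ \text{around } E}}
\tag{14.11}
$$

or

$$
\underbrace{\begin{array}{c}\text{number of particles}\\ \text{in } E \to E + dE\end{array}}_{=\; N(E)\, dE}
= \underbrace{N_{\rm tot}\, C e^{-E/kT}}_{=\; n(E)} \times\; g(E)\, dE\,,
\tag{14.12}
$$

where $N(E)$ is the number of particles per unit energy interval, $n(E)$ is the mean number of
particles per state, and $g(E)$ is the number of states per unit energy interval. This is written
more compactly as

$$
N(E) = n(E)\, g(E)\,.
\tag{14.13}
$$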

Systems are characterised by $n(E)$, the mean number of particles per state (or occupation
number), and $g(E)$, the density of states, or “spectrum of accessible states”. When the
number of particles is much less than the number of states, the Boltzmann distribution
holds, meaning

$$
n(E) \propto e^{-E/kT}\,.
\tag{14.14}
$$

In contrast to the occupation number, the density of states $g(E)$ varies widely from system
to system.


15  Introducing  Quantum  Statistics 


In quantum mechanics, identical particles really are identical: they cannot be distinguished,
even in principle. This loss of individuality has consequences for arguments in
which they are counted. Previously we have focused on a single particle and asked for the
chance that it can occupy any one of several different states (and so calculated the mean
number of particles per state). We must now take two things into account:

-  the  particles’  indistinguishability, 

-  the  higher  particle  densities  encountered  in  systems  for  which  a  quantum  mechanical 
treatment  is  necessary. 


These are both incorporated by shifting focus to a state, and computing the chance that
it’s occupied by some number of particles. In (10.19) we wrote down the probability that
a single quantum state of energy $E_s$ and volume $V_s$ is occupied by $N_s$ particles:

$$
p_s \propto \exp\frac{-E_s - PV_s + \mu N_s}{kT}\,.
\tag{15.1}
$$

Typically the volume of a state is fixed; the state is occupied by n particles, each of
energy E. Then (15.1) becomes

$$
p_n \propto \exp\frac{-nE + \mu n}{kT} = \exp\frac{-n(E - \mu)}{kT}\,.
\tag{15.2}
$$
Write this last equation as

$$
p_n = \frac{e^{-n\alpha}}{Z}\,, \qquad \text{where } \alpha = \frac{E - \mu}{kT} > 0\,,
\tag{15.3}
$$

and Z is the (reciprocal of the) normalisation, called the partition function, that we met
previously in Section 11.2. Determine it by writing

$$
1 = \sum_n p_n = \frac{1}{Z}\sum_n e^{-n\alpha}\,,
\tag{15.4}
$$

so that

$$
Z = \sum_n e^{-n\alpha}\,.
\tag{15.5}
$$

The occupation number, being the mean number of particles in a state, is

$$
\bar n = \sum_n n\, p_n = \frac{\sum_n n\, e^{-n\alpha}}{Z} = \frac{-1}{Z}\frac{dZ}{d\alpha}\,.
\tag{15.6}
$$


This  is  an  example  of  the  utility  of  the  partition  function:  once  it’s  found,  other  quantities 
can  be  produced  from  it  by  simple  operations  such  as  differentiation. 


15.1  Two  Types  of  Fundamental  Particle 

Experiments  show  that  fundamental  particles  come  in  either  of  two  types: 


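Fermions. At most one fermion can occupy a given state, so (15.5) becomes

$$
Z = \sum_{n=0}^{1} e^{-n\alpha} = 1 + e^{-\alpha}\,.
\tag{15.7}
$$

Hence (15.6) gives

$$
\bar n = \frac{e^{-\alpha}}{1 + e^{-\alpha}} = \frac{1}{e^{\alpha} + 1}\,.
\tag{15.8}
$$

Bosons. Any number of bosons can occupy a given state, so (15.5) becomes

$$
Z = \sum_{n=0}^{\infty} e^{-n\alpha} = \frac{1}{1 - e^{-\alpha}}\,, \qquad \text{and} \qquad \bar n = \frac{1}{e^{\alpha} - 1}\,.
\tag{15.9}
$$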


The occupation number can be written for both particles at once with

$$
\bar n = \frac{1}{\exp\dfrac{E - \mu}{kT} \pm 1}
\qquad (+\ \text{for fermions},\ -\ \text{for bosons}).
\tag{15.10}
$$

Fermions turn out to be particles with odd half-integral spin, such as electrons, positrons,
protons, neutrons, neutrinos, and muons. Bosons turn out to be particles with integer
spin, such as α particles, pions, photons, and deuterons. Fundamental particles don’t
seem to exist with any other spin varieties than these two, so they are all either fermions
or bosons. Just why spin should determine a particle’s occupation number is analysed by
relativistic quantum mechanics, but the fundamental reason is not well understood. The
study of fermions is known as Fermi-Dirac statistics, while that of bosons is Bose-Einstein
statistics. In the high-energy limit, or low particle-density limit, these both reduce to what
we have been studying up until now, Maxwell-Boltzmann statistics.
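The route from partition function to occupation number can be checked numerically. The sketch below (function names are illustrative assumptions) evaluates the mean occupation directly as a ratio of sums over n, compares it with the closed forms for fermions and bosons, and shows both approaching the Boltzmann factor exp(−α) when α = (E − μ)/kT is large.

```python
import math


def occupation_from_sum(alpha, n_max):
    """Mean occupation sum(n e^{-n alpha}) / sum(e^{-n alpha}), n = 0..n_max.
    n_max = 1 describes a fermion state; a large n_max approximates a boson state."""
    weights = [math.exp(-n * alpha) for n in range(n_max + 1)]
    z = sum(weights)   # partition function, truncated at n_max
    return sum(n * w for n, w in enumerate(weights)) / z


def fermi_dirac(alpha):
    return 1.0 / (math.exp(alpha) + 1.0)


def bose_einstein(alpha):
    return 1.0 / (math.exp(alpha) - 1.0)   # requires alpha > 0


def maxwell_boltzmann(alpha):
    return math.exp(-alpha)


alpha = 0.5
print(occupation_from_sum(alpha, 1), fermi_dirac(alpha))
print(occupation_from_sum(alpha, 2000), bose_einstein(alpha))

# For large alpha both quantum occupation numbers approach exp(-alpha).
for a in (1.0, 5.0, 10.0):
    print(a, fermi_dirac(a), bose_einstein(a), maxwell_boltzmann(a))
```

The boson sum is truncated at an assumed cutoff of 2000 terms, which is ample for α = 0.5 because the terms decay geometrically.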


16  Blackbody  Radiation 

Hot  objects  contain  oscillating  charges,  and  oscillating  charges  radiate  electromagnetic 
waves.  An  obvious  example  is  the  light  emitted  by  the  hot  gas  that  makes  up  a  flame. 
The  electromagnetic  theory  of  just  how  this  occurs  is  complicated  even  for  very  simple 
systems;  but  given  that  a  huge  number  of  oscillating  charges  are  responsible  for  the  light 
emitted,  it  turns  out  that  we  can  use  statistical  mechanics  to  examine  such  macroscopic 
objects  within  electromagnetic  theory. 

But  the  emission  of  light  is  not  an  equilibrium  process;  there’s  a  continuous  transfer 
of  energy  from  the  object  to  the  waves,  which  is  then  lost  from  the  object.  We  have 
only  considered  equilibrium  processes  in  this  course,  and  the  subject  of  non-equilibrium 
processes  is  an  advanced  branch  of  statistical  mechanics.  However,  we  can  calculate  how 
much  light  is  emitted  from  a  hot  body  by  considering  a  related  process  that  does  occur  in 
equilibrium. The idea that allows us to make this connection is the Principle of Detailed
Balance, which is discussed in [3]. Because oscillating charges emit light, if a body has
charges  that  resonate  at  some  particular  frequency,  then  not  only  will  it  readily  emit  light 
of  that  frequency,  but  it  will  also  readily  absorb  light  of  that  frequency.  The  Principle  of 
Detailed  Balance  postulates  that  the  ability  to  emit  equals  the  ability  to  absorb: 

When  in  thermal  equilibrium  with  a  bath  of  electromagnetic  waves,  any  object — 
regardless  of  its  colour  or  makeup — emits  the  same  spectrum  and  intensity  that 
it  absorbs. 

So  we  will  derive  the  spectrum  emitted  by  a  hot  body  by  examining  a  related  scenario: 
the  spectrum  that  exists  inside  a  hot  oven.  The  walls  inside  the  oven  are  in  equilibrium 
with  the  radiation  inside,  so  we  can  use  statistical  mechanics  to  examine  that,  and  transfer 
what  we  learn  to  the  emitting  hot  body. 

Consider,  then,  a  perfectly  absorbing  (“black”)  body  placed  in  an  oven  that  is  ideal 
in  the  sense  that  it  is  “perfectly  emitting”  (which  we’ll  define  in  a  moment).  The  black 
body  must  emit  exactly  what  it  absorbs;  but  by  definition,  it  absorbs  all  the  radiation  it 
receives.  So  it  must  emit  the  same  spectrum  that  the  oven  produces.  But  the  mechanism 
for  how  the  body  emits  doesn’t  depend  on  the  oven,  so  the  black  body  must  therefore 
also  emit  identically  when  outside  the  oven.  We  conclude  that  the  spectrum  of  frequencies 
produced  by  a  black  body  equals  that  found  inside  an  ideal  oven. 

16.1  The  Radiation  Inside  an  Oven 

Next consider an oven (often called a cavity) with hot walls, containing radiation. We
wish to find this radiation’s spectral energy density $\varrho(f)$: its total electromagnetic energy
per unit frequency per unit oven volume. Different hot materials emit different amounts of
each  wavelength,  so  we  cannot  hope  to  use  only  general  arguments  to  obtain  the  spectral 
energy  density  of  an  oven  made  from  some  arbitrary  material.  Also,  we  cannot  expect  to 
discuss  the  emission  of  arbitrarily  low  frequencies  (long  wavelengths)  from  any  one  oven, 
it  being  problematic  to  discuss  a  light  wave  of  longer  wavelength  than  the  typical  size  of 
the  object  that  produced  it. 

What  electromagnetic  frequencies  exist  inside  the  oven?  It  might  be  argued  that  the 
electric  field  inside  a  metal  oven  will  go  to  zero  at  the  walls  since  otherwise  wall  currents 
would  arise  which  would  then  eliminate  the  field  at  the  walls.  That  would  quantise  the 
field  modes  so  that  only  certain  frequencies  would  be  present — although  for  all  practical 
purposes  they  would  form  a  continuum.  But  in  a  ceramic  oven,  the  field  need  not  go  to 
zero  at  the  walls,  and  so  the  frequencies  present  need  not  be  quantised.  On  the  other  hand, 
if  the  walls  inside  an  oven  are  reflective  enough  that  a  light  wave  inside  bounces  back  and 
forth  many  times,  it  will  be  reinforced  if  a  whole  number  of  wavelengths  fit  into  a  round 
trip.  Different  ovens  will  have  different  amounts  of  internal  reflectivity,  and  different-sized 
ovens  will  reinforce  some  wavelengths  but  not  others. 

The  task  of  calculating  a  spectral  energy  density  is  beginning  to  look  difficult!  To 
make  progress,  we  consider  an  idealised  oven  that  holds  a  continuum  of  wavelengths.  Its 
wall  oscillators  produce  light  of  all  frequencies;  this  light  bounces  about  inside  the  oven, 
sometimes  reflected  and  sometimes  not,  so  that  the  spread  of  frequencies  quickly  tends 
toward  some  equilibrium  distribution. 

The  following  argument  suggests  that  the  oven’s  shape  can  be  arbitrary.  Join  two 
differently  shaped  idealised  ovens  at  the  same  temperature  via  a  hole.  If  the  radiation 
spectra  of  the  two  differed  around  some  particular  frequency  (say,  yellow  light),  we  could 
presumably  introduce  a  filter  that  passed  that  frequency  only.  That  would  allow  a  flow 
of  energy  in  one  direction  through  the  hole,  which  would  perhaps  act  to  “unequalise”  the 
temperatures.  It’s  unreasonable  for  the  system  to  depart  from  thermal  equilibrium  in  such 
a  way — it  contravenes  the  Second  Law  of  Thermodynamics.  So  we  might  conclude  that 
there  cannot  be  such  a  flow  of  energy,  so  that  the  oven’s  shape  doesn’t  affect  the  spectrum 
of  radiation  inside. 

Actually,  this  argument  is  not  quite  as  straightforward  as  it  might  appear.  While  the 
filter  would  pass  yellow  light  into  the  oven  whose  walls  did  not  naturally  emit  much  yellow 
light,  the  Principle  of  Detailed  Balance  says  that  those  walls  wouldn’t  absorb  much  yellow 
light  either,  in  which  case  that  oven’s  temperature  need  not  increase.  Would  the  yellow 
light  then  build  up  inside  that  oven,  perhaps  interacting  with  the  filter  to  heat  it  up  until  it 
broke  down?  Also,  the  hot  filter  would  emit  radiation  of  its  own.  We  will  simply  postulate 
that  an  idealised  oven’s  spectral  energy  density  is  independent  of  its  shape,  and  appeal  to 
experiment  for  validation. 

In that case, consider an oven shaped as a rectangular box with side lengths $L_x$, $L_y$, $L_z$
and volume $V = L_x L_y L_z$. We assume that

-  the  oven  walls  are  continuously  emitting  and  absorbing  radiation, 

-  the  oven’s  shape  doesn’t  affect  its  spectrum, 

-  there  is  no  restriction  on  what  frequencies  can  exist  inside  the  oven, 

-  the  walls  contain  a  huge  number  of  quantised  harmonic  oscillators, 
- at thermal equilibrium, the total energy of the oven radiation in one “frequency
state” (defined soon) in the frequency interval $f \to f + df$ equals the mean thermal
energy $\varepsilon(f)$ of a wall oscillator at frequency $f$. That is,

$$
\frac{[\text{total energy of radiation in } f \to f + df]}{[\text{number of frequency states in } f \to f + df]}
= \frac{\varrho(f)\, df\, V}{g(f)\, df} = \varepsilon(f)\,,
\tag{16.1}
$$

in which case

$$
\varrho(f) = \frac{\varepsilon(f)\, g(f)}{V}\,.
\tag{16.2}
$$


We calculate the mean thermal energy $\varepsilon(f)$ and the density of states $g(f)$ as follows.


First Requirement: the Mean Thermal Energy of an Oscillator, $\varepsilon(f)$


Model the walls of the oven as composed of a set of quantum oscillators held at temperature T.
(That is, we can consider the walls to be in contact with a heat bath at T.) Recall
that the nth energy level of the quantised oscillator has energy $(n + 1/2)\,hf$, giving it a
thermal energy of $nhf$, since the $\tfrac{1}{2}hf$ is the ever-present zero-point energy which cannot
be taken away from the oscillator; it’s not thermal energy, so does not enter our analysis.
Refer to Section 10.3.2 to write the mean thermal energy of the oscillators as


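$$
\varepsilon(f) = \frac{\sum_{n=0}^{\infty} e^{-nhf/kT}\, nhf}{\sum_{n=0}^{\infty} e^{-nhf/kT}}
= \frac{hf \sum_n n\, e^{n\alpha}}{\sum_n e^{n\alpha}}\,, \qquad \text{with } \alpha = \frac{-hf}{kT}\,.
\tag{16.3}
$$

Because $e^{\alpha} < 1$, the denominator of the last fraction in (16.3) is simply a geometric series:

$$
\sum_{n=0}^{\infty} e^{n\alpha} = \frac{1}{1 - e^{\alpha}}\,,
\tag{16.4}
$$

in which case the sum in the numerator of the last fraction in (16.3) is

$$
\sum_{n=0}^{\infty} n\, e^{n\alpha} = \frac{d}{d\alpha} \sum_{n=0}^{\infty} e^{n\alpha}
= \frac{d}{d\alpha}\, \frac{1}{1 - e^{\alpha}} = \frac{e^{\alpha}}{(1 - e^{\alpha})^2}\,.
\tag{16.5}
$$

Hence (16.3) becomes

$$
\varepsilon(f) = \frac{hf}{e^{hf/kT} - 1}\,.
\tag{16.6}
$$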

Note that in the regimes of low and high temperature,

$$
\begin{aligned}
kT \ll hf &\iff \varepsilon(f) \to 0\,, \\
kT \gg hf &\iff \varepsilon(f) \simeq kT\,.
\end{aligned}
\tag{16.7}
$$

The first equation just shows that the thermal energy vanishes as $T \to 0$, and the second is the
expected result for a classical oscillator (2 d.o.f.), from the Equipartition theorem.
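The closed form for the mean thermal energy and its two limits can be verified numerically. This is a minimal sketch with illustrative function names; the temperature of 300 K, the sample frequencies and the summation cutoff are arbitrary assumptions.

```python
import math

H = 6.62607015e-34   # Planck's constant, J s
K_B = 1.380649e-23   # Boltzmann's constant, J/K


def mean_thermal_energy(f, temp):
    """Closed form hf / (exp(hf/kT) - 1)."""
    x = H * f / (K_B * temp)
    return H * f / math.expm1(x)


def mean_thermal_energy_by_sum(f, temp, n_max=5000):
    """Direct ratio of sums over oscillator levels n = 0..n_max.
    The cutoff n_max must satisfy n_max * hf/kT >> 1 for a good answer."""
    x = H * f / (K_B * temp)
    weights = [math.exp(-n * x) for n in range(n_max + 1)]
    return H * f * sum(n * w for n, w in enumerate(weights)) / sum(weights)


temp = 300.0
for f in (1e12, 1e13, 1e14):
    print(f, mean_thermal_energy(f, temp), mean_thermal_energy_by_sum(f, temp))

# Limits: kT >> hf gives about kT; kT << hf gives essentially zero.
print(mean_thermal_energy(1e10, temp) / (K_B * temp))
print(mean_thermal_energy(1e15, temp) / (K_B * temp))
```

Using `math.expm1` rather than `math.exp(x) - 1` keeps the low-frequency limit accurate when hf/kT is tiny.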


The Schrödinger equation ascribes an infinite number of energy levels to a harmonic oscillator:
the nth level has thermal energy $nhf$. The quantum statistics view of Section 15 is
that the oscillator has just one state, which can be occupied by any number n of particles,
called photons, that each have energy $hf$. The expression for $\varepsilon(f)$ in (16.6) implies that
the mean number of photons of energy $hf$ in the oven is

$$
\bar n(f) = \frac{\text{mean energy of photons}}{\text{energy per photon}}
= \frac{\varepsilon(f)}{hf} = \frac{1}{e^{hf/kT} - 1}\,.
\tag{16.8}
$$

Compare this with (15.10): we conclude that photons are bosons with chemical potential $\mu = 0$.


Second  Requirement:  Density  of  States  g(f)  for  Light  Waves 

In Section 3.1 we derived an ideal gas’s density of states $g(E)$ at energy E by calculating the
total number of energy states in the energy range $0 \to E$, and then using $g(E) = \Omega'_{\rm tot}(E)$.
We calculated the total number of states by postulating that each state could be represented
by a cell in phase space. The total number of states was then just the total number
of cells, which equalled the total phase space volume able to be occupied for energies from
0 to E divided by the volume of one cell. Constructing a cell wasn’t quite a unique affair,
because although we used Planck’s constant h to give the cell a natural size, the remarks
on page 20 showed that a state itself was not very well defined. And yet, happily, this
latitude in how we defined a state had no effect on calculations of entropy increase.

We’ll  use  a  similar  argument  here  to  count  the  number  of  wave  states  in  the  oven.  Just 
as  for  the  ideal  gas  above,  there  is  some  latitude  in  how  we  can  define  a  wave  state.  The 
following  line  of  argument  is  accepted  in  statistical  mechanics  because  its  prediction  of 
how  much  radiation  exists  inside  an  oven  has  stood  up  well  to  experimental  tests. 

So, as in Section 3.1, calculate the density of states $g(f)$ at frequency $f$ by defining
and counting the total number of states $\Omega_{\rm tot}(f)$ in the frequency range $0 \to f$, and then use
$g(f) = \Omega'_{\rm tot}(f)$.

There is an immediate problem in defining and counting wave states. With the range
of allowed frequencies being continuous, the idea of a frequency state doesn’t immediately
make any sense. We can make progress by identifying each wave by its “wavenumber”
vector $\boldsymbol{k}$ (usually just called its wave vector), whose length is $k = 2\pi/\lambda = 2\pi f/c$ where
c is the speed of light, and which points in the wave’s direction of travel.

Why use $\boldsymbol{k}$ to calculate the density of states? If $\boldsymbol{n}$ is a unit vector pointing in the wave’s direction
of travel, could we define a “frequency vector” $\boldsymbol{f} = f\boldsymbol{n}$, or a “wavelength vector” $\boldsymbol{\lambda} = \lambda\boldsymbol{n}$, and use
one of these instead to characterise the wave? It turns out that these are not reasonable quantities
to define. Suppose the wave’s direction of travel has angle θ with the x axis. If $\boldsymbol{f}$ were indeed a
vector, we would probably expect that its x component $f_x = f\cos\theta$ would be the frequency of the
wave crests’ intersections with the x axis; but that frequency is in fact $f$! Also, if $\boldsymbol{\lambda}$ were indeed
a vector, we might expect that its x component $\lambda_x = \lambda\cos\theta$ would be the wavelength of the wave
crests’ intersections with the x axis; but this wavelength is actually $\lambda/\cos\theta$. So neither of these
would-be “vectors” $\boldsymbol{f}$ or $\boldsymbol{\lambda}$ is particularly meaningful, and that’s why physicists don’t define them.

But the appearance of $\cos\theta$ in the denominator a couple of lines up suggests that the reciprocal
of wavelength might make a vector. And indeed it does: enter the wave vector $\boldsymbol{k} = k\boldsymbol{n}$, where
$k = 2\pi/\lambda$ is the wave number. The $2\pi$ is just for convenience, but the λ in the denominator
now sends the $\cos\theta$ back to the numerator in the previous paragraph’s discussion of components.
That means the wave vector’s x component $k_x = k\cos\theta$ is indeed the wave number of the crests’
intersections with the x axis: $k_x = 2\pi/(\text{wavelength of crests’ intersections})$. So $\boldsymbol{k}$ behaves just
as we expect vectors to behave. That’s why it is so useful for characterising waves, and why it’s
found everywhere in wave theory.
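The claim that crests meet the x axis at a spacing of λ/cos θ (and not λ cos θ) can be checked by brute force. The sketch below is illustrative only: it builds a plane wave cos(k·r) in two dimensions, walks along the x axis from the crest at the origin, and reports where the next crest is found. The function name, step size and sample values are assumptions.

```python
import math


def crest_spacing_along_x(wavelength, theta, dx=1e-4):
    """Distance along the x axis between the crest at the origin and the
    next crest, for a plane wave travelling at angle theta to the x axis."""
    k = 2 * math.pi / wavelength
    nx, ny = math.cos(theta), math.sin(theta)   # unit vector along travel

    def psi(x, y):
        return math.cos(k * (nx * x + ny * y))  # plane wave cos(k . r)

    x = dx
    # Step along y = 0 until a grid point is a local maximum of the wave.
    while not (psi(x, 0.0) >= psi(x - dx, 0.0) and psi(x, 0.0) >= psi(x + dx, 0.0)):
        x += dx
    return x


wavelength, theta = 0.5, math.radians(60)
print(crest_spacing_along_x(wavelength, theta))   # measured spacing
print(wavelength / math.cos(theta))               # lambda / cos(theta)
print(wavelength * math.cos(theta))               # lambda * cos(theta), for contrast
```

The measured spacing matches λ/cos θ to within the step size, so 2π divided by that spacing is k cos θ, the x component of the wave vector.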
Any particular wave has a vector $\boldsymbol{k} = (k_x, k_y, k_z)$. Perhaps we can find the number $\Omega_{\rm tot}(f)$
of states in the frequency range $0 \to f$ by counting how many wave vectors can fit into a
sphere in 3-dimensional “k space” such that the longest vector has a length corresponding
to frequency $f$: this length will be $k = 2\pi/\lambda = 2\pi f/c$. But, of course, there are an infinite
number of such vectors, so it will do no good to treat each one as a separate state. We must
postulate something new: that the vectors can be “binned”, grouped into cells in k space.
Each cell defines 2 states, corresponding to the 2 possible polarisations that a wave can
have. A cell is defined by requiring the coordinates $(k_x, k_y, k_z)$ of one of its corners to
satisfy a certain condition. We must search for a condition that leads to experimentally
verifiable predictions. Consider two such conditions, which both lead to the same $g(f)$
(which is eventually given experimental validation).

(a) We treat the oven as holding a continuum of travelling waves, and require the coordinates
$k_x$, $k_y$, $k_z$ of a cell each to be related to a whole number of wavelengths that
fit into the corresponding side lengths of the oven. That is, cells that are adjacent
along say the x axis describe waves whose numbers of wavelengths fitting into a side
length along the x axis differ by 1. In this case, remembering the factor of 2 for the
polarisations,

$$
\begin{aligned}
\Omega_{\rm tot} &= 2 \times \text{number of cells} \\
&= 2 \times \frac{\text{volume of sphere of radius } k = 2\pi/\lambda}{\text{volume of one cell}}\,.
\end{aligned}
\tag{16.9}
$$

What is the volume of a cell? We need to know each of its edge lengths. This is a question that arises
frequently in statistical mechanics. We can answer it with the help of some apparently unrelated
analysis: if we have some function $z(x, y)$

$$
z = ax + by + c
\tag{16.10}
$$

where a, b, c are constants, how does z increase when x and y increase? By definition, increases $\Delta x$
and $\Delta y$ in x and y cause an increase $\Delta z$ in z, so

$$
z + \Delta z = a(x + \Delta x) + b(y + \Delta y) + c\,.
\tag{16.11}
$$

Subtracting (16.10) from (16.11) gives

$$
\Delta z = a\,\Delta x + b\,\Delta y\,.
\tag{16.12}
$$

This  is  a  useful  expression  because  it  means  that  the  operation  of  finding  the  increase  is  linear. 
Linearity  is  a  central  theme  in  physics.  An  operation  L  is  linear  if,  for  constants  a  and  b, 


L(ax  +  by)  =  aL(x)  +  bL(y) . 


(16.13) 


It  suffices  to  have  just  two  terms  on  the  right  hand  side  of  (16.13),  but  it’s  easy  to  show  that  if  L  is 
linear,  then  it  can  be  applied  to  any  number  of  terms: 

L(ax  +  by  +  cz  +  . . .)  =  aL(x)  +  bL(y)  +  cL(z)  +  . . .  (16.14) 

We’ll use the idea that Δ is linear in a moment.


With $\Delta k_x$ being the increase in $k_x$ along the side of a cell in the x direction, and
similarly for y and z, a cell must have volume $\Delta k_x\, \Delta k_y\, \Delta k_z$. We can calculate
e.g. $\Delta k_x$ by writing $k_x$ in terms of something else with a known increase along a
cell’s x direction. We haven’t yet used the requirement that a whole number of
wavelengths must fit into the corresponding side lengths of the oven. These whole
numbers of wavelengths are triplets $(n_x, n_y, n_z)$ such that

$$
n_\alpha \lambda_\alpha = L_\alpha\,, \quad \text{for } \alpha = x, y, z\,.
\tag{16.15}
$$

Thus

$$
k_\alpha = \frac{2\pi}{\lambda_\alpha} = \frac{2\pi n_\alpha}{L_\alpha}\,.
\tag{16.16}
$$

Now remembering that Δ is linear, we can immediately write

$$
\Delta k_\alpha = \frac{2\pi\, \Delta n_\alpha}{L_\alpha} = \frac{2\pi}{L_\alpha}\,,
\tag{16.17}
$$

since by definition $\Delta n_x = 1$ when we move along a cell in the x direction, and
similarly for y, z. So

$$
\text{cell volume} = \Delta k_x\, \Delta k_y\, \Delta k_z = \frac{8\pi^3}{L_x L_y L_z} = \frac{8\pi^3}{V}\,.
\tag{16.18}
$$

Hence (16.9) becomes

$$
\Omega_{\rm tot} = 2 \times \frac{\frac{4}{3}\pi \left(\frac{2\pi f}{c}\right)^3}{8\pi^3/V} = \frac{8\pi f^3 V}{3c^3}\,.
\tag{16.19}
$$


(b) Alternatively, we treat the oven as containing standing waves only. In that case
the $k_x$, $k_y$, $k_z$ each specify that a whole number of half-wavelengths fits into each
dimension of the oven. For this case we use only positive values of the wave vector
components; the reason is that while a standing wave is comprised of superposed
travelling waves of opposite-sign wavenumbers, it needs only the positive wavenumber
to quantify it. Again with a factor of 2 for the polarisations,

$$
\begin{aligned}
\Omega_{\rm tot} &= 2 \times \text{number of cells} \\
&= 2 \times \frac{\text{volume of one octant of sphere of radius } k = 2\pi/\lambda}{\text{volume of one cell}}\,,
\end{aligned}
\tag{16.20}
$$

since positive wavenumbers comprise just one octant. Now to determine the cell
volume $\Delta k_x\, \Delta k_y\, \Delta k_z$, realise that the whole numbers of half-wavelengths $(n_x, n_y, n_z)$
satisfy

$$
n_\alpha\, \frac{\lambda_\alpha}{2} = L_\alpha\,, \quad \text{for } \alpha = x, y, z\,;
\tag{16.21}
$$

thus

$$
k_\alpha = \frac{2\pi}{\lambda_\alpha} = \frac{\pi n_\alpha}{L_\alpha}\,, \quad \text{and so} \quad \Delta k_\alpha = \frac{\pi}{L_\alpha}\,.
\tag{16.22}
$$

Then

$$
\text{cell volume} = \Delta k_x\, \Delta k_y\, \Delta k_z = \frac{\pi^3}{L_x L_y L_z} = \frac{\pi^3}{V}\,,
\tag{16.23}
$$

and (16.20) becomes

$$
\Omega_{\rm tot} = 2 \times \frac{\frac{1}{8} \cdot \frac{4}{3}\pi \left(\frac{2\pi f}{c}\right)^3}{\pi^3/V} = \frac{8\pi f^3 V}{3c^3}\,,
\tag{16.24}
$$
just as obtained for condition (a). This is no surprise; in going from condition (a)
to (b), we reduced the volume of k space by a factor of 8, but we also reduced the
volume of a cell by the same factor, so $\Omega_{\rm tot}$ is unchanged.

Both conditions (a) and (b) give the same number of states $\Omega_{\rm tot}$, so we conclude that the
density of states is

$$
g(f) = \Omega'_{\rm tot}(f) = \frac{8\pi f^2 V}{c^3}\,.
\tag{16.25}
$$
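The cell counting of condition (b) can be imitated by explicitly counting integer triplets. The sketch below assumes a cubic oven of side L (a special case of the rectangular box) and introduces a hypothetical dimensionless parameter `r_max` = 2Lf/c, taken to be an integer for convenience; with it, the standing-wave condition restricts the positive integers to n_x² + n_y² + n_z² ≤ r_max², and the continuum result 8πf³V/(3c³) becomes π r_max³/3.

```python
import math


def count_standing_wave_states(r_max):
    """Count states for condition (b) in a cubic oven.

    Cells are labelled by positive integers (nx, ny, nz) satisfying
    nx^2 + ny^2 + nz^2 <= r_max^2, and each cell holds 2 polarisations."""
    r2 = r_max * r_max
    cells = 0
    for nx in range(1, r_max + 1):
        for ny in range(1, r_max + 1):
            rem = r2 - nx * nx - ny * ny
            if rem >= 1:
                cells += math.isqrt(rem)   # number of allowed nz >= 1
    return 2 * cells


def omega_tot(r_max):
    """Continuum estimate 8 pi f^3 V / (3 c^3), rewritten as pi r_max^3 / 3."""
    return math.pi * r_max ** 3 / 3


for r in (10, 50, 200):
    counted = count_standing_wave_states(r)
    print(r, counted, round(omega_tot(r)), round(counted / omega_tot(r), 4))
```

The explicit count sits slightly below the continuum estimate because cells cut by the sphere's surface are left out, and the fractional shortfall shrinks as `r_max` grows, which is the regime relevant to an oven much larger than the wavelengths of interest.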

A comment on this calculation of $g(f)$. The fact that conditions (a) and (b) (and
other similar conditions that can be used) give the same number of states indicates that
there might be something simple hiding behind our analysis. Although each frequency
in  a  continuum  can  have  only  infinitesimal  energy,  there  are  an  infinite  number  of  such 
waves  in  any  cell  in  k  space.  It  seems  that  in  our  idealised  oven,  perhaps  all  of  these  waves’ 
energies  integrate  to  give  the  same  energy  as  would  be  contained  in  just  one  standing  wave 
occupying  a  cell — although  granted  we  have  postulated  a  continuum  of  waves  in  the  oven. 
Even  so,  we  did  assume  on  page  66  that  at  thermal  equilibrium  the  total  energy  in  one 
frequency  state  is  determined  by  the  wall  oscillators,  and  nothing  was  said  about  whether 
this  energy  could  be  held  by  only  a  continuum  of  waves,  or  whether  standing  waves  could 
also  possess  it. 

Textbooks  usually  choose  one  or  both  of  conditions  (a)  and  (b)  above:  they  tend  to 
treat  the  oven  as  full  of  resonating  waves,  even  though  we  might  presume  that  an  ideal 
oven  would  be  made  of  perfectly  black  material  whose  walls  would  not  reflect  waves  at  all, 
and  so  would  not  allow  the  wave  to  resonate  by  bouncing  back  and  forth.  But  we  see  here 
that  this  assumption  of  resonance  is  not  actually  necessary.  In  analogy,  we  saw  on  page  20 
that  there  is  more  than  one  way  to  define  the  state  of  an  ideal-gas  particle  via  a  cell  in 
phase  space  by  using  any  multiple  of  Planck’s  constant,  but  that  luckily  what  results  is  a 
unique  expression  for  increases  in  entropy.  The  difficulty  in  defining  the  state  of  the  gas 
particle  is  due  to  our  insisting  on  counting  states,  which  entails  the  notion  of  a  discrete 
state. 

The same ideas of counting apply to the waves in the oven. Defining and then counting
their states by constructing discrete cells in wavenumber space can be problematic: do
we use wavelengths or half wavelengths, and why does it not matter what we do? But in
the end we have some kind of counting procedure that gives a seemingly unique expression
for $g(f)$, and this expression turns out to produce the experimentally verified expression
for the spectral energy density $\varrho(f)$. It’s certainly interesting and nontrivial why this
should be so. Ideas of counting frequency states are intimately related to the quantum
mechanical idea of representing the waves by a “gas” of photons. A fundamental difference
between photons and the particles of our ideal gas in Section 3.1 is that the number of
photons in an oven is not constant over time, unlike the number of ideal gas particles in
a container. More discussion of the idea of photons comprising such a gas can be found
in [4].

An Alternative Approach to the Maths of Counting States. On page 53 we highlighted
the fact that calculating the density of states g by differentiating $\Omega_{\rm tot}$ is essentially
the same as calculating $d\Omega_{\rm tot}$. Whereas $\Omega_{\rm tot}$ requires the calculation of the volume of a
sphere in k space, $d\Omega_{\rm tot}$ requires the calculation of that sphere’s surface area.
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The  same  idea  applies  here.  Our  calculations  of  g(f)  in  conditions  (a)  and  (b)  above 
were  really  no  different  to  counting  the  number  of  cells  within  a  thin  spherical  shell  of  ra¬ 
dius  A,  which  could  be  done  by  treating  the  volume  as  surface  area  times  shell  thickness  dA. 
In  essence  we  calculated  the  area  of  the  spherical  shell  by  differentiating  its  volume  4/3  7tA3 
with  respect  to  k  to  arrive  at  Airk  .  In  the  same  way,  textbooks  usually  calculate  g(f)  by 
considering  this  shell  of  k  space.  They  do  this  for  e.g.  condition  (a)  by  writing 


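\[
\begin{aligned}
g(f)\,df &= 2 \times \text{number of cells in spherical shell of radius } k \text{ and thickness } dk \\
&= 2 \times \frac{\text{volume of spherical shell of radius } k \text{ and thickness } dk}{\text{volume of one cell}} \\
&= \frac{2 \times 4\pi k^2\,dk}{8\pi^3/V} = \frac{2 \times 4\pi\,(2\pi f/c)^2\,(2\pi/c)\,df}{8\pi^3/V} = \frac{8\pi f^2 V\,df}{c^3}\,,
\end{aligned}
\tag{16.26}
\]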


which is (16.25) again without any mention of $\Omega_{\rm tot}$ and so without having to differentiate
it with respect to f. This is fine. We only calculated $\Omega_{\rm tot}$ to emulate and reinforce our
approach in Section 3.1 for calculating the number of energy states of an ideal gas.
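The shell-counting idea can be checked by brute force. The following is a minimal sketch, not part of the original derivation: it measures k in units of the cell width (so each cell has unit volume), counts the cells whose centres lie in a shell, and compares the count with the shell's volume. The radius and thickness are arbitrary illustrative values, and the function name is hypothetical.

```python
import math

def count_cells_in_shell(radius, thickness):
    """Count integer lattice points (nx, ny, nz) with radius <= |n| < radius + thickness."""
    outer = radius + thickness
    n_max = int(math.ceil(outer))
    count = 0
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            for nz in range(-n_max, n_max + 1):
                r = math.sqrt(nx * nx + ny * ny + nz * nz)
                if radius <= r < outer:
                    count += 1
    return count

RADIUS = 30.0     # shell radius in units of the cell width (illustrative value)
THICKNESS = 2.0   # shell thickness in the same units (illustrative value)

counted = count_cells_in_shell(RADIUS, THICKNESS)
# Exact shell volume; for a thin shell this tends to 4*pi*R^2*dR.
shell_volume = 4.0 / 3.0 * math.pi * ((RADIUS + THICKNESS) ** 3 - RADIUS ** 3)
relative_error = abs(counted - shell_volume) / shell_volume
print(counted, round(shell_volume), relative_error)
```

The count and the volume agree more closely as the radius grows, which is why replacing a count of cells by a volume is acceptable for the very large numbers of states involved.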


The  End  Product:  Planck’s  Law 


Now that we have $\varepsilon(f)$ in (16.6) and g(f) in (16.25), we can go back to (16.2) to write
Planck’s law:

\[
\varrho(f) = \frac{8\pi h f^3/c^3}{e^{hf/(kT)} - 1}\,. \tag{16.27}
\]


Compare Planck’s result for the energy density with the prior result of Rayleigh and Jeans, who
used the classical expression $\varepsilon(f) = kT$, based on equipartition with 2 d.o.f., as can be seen
in (16.7). This gave them an energy density of

\[
\varrho(f) = \frac{8\pi f^2 kT}{c^3}\,. \tag{16.28}
\]

This  equation’s  very  wrong  prediction  of  arbitrarily  large  amounts  of  radiation  at  high  frequencies 
was  called  the  “ultraviolet  catastrophe”.  Planck’s  result  rested  on  energy  quantisation,  and  marked 
the  beginning  of  quantum  theory.  Note  that  Planck’s  expression  reduces  to  that  of  Rayleigh  and 
Jeans  in  the  low-frequency  limit. 


16.2  Total  Energy  per  Unit  Volume  of  the  Oven,  U 


The total radiation energy in a unit volume is the spectral energy density integrated over
all frequencies:

\[
U = \int_0^\infty \varrho(f)\,df\,. \tag{16.29}
\]

Writing $x = hf/(kT)$ converts this to

\[
U = \frac{8\pi k^4 T^4}{c^3 h^3} \underbrace{\int_0^\infty \frac{x^3\,dx}{e^x - 1}}_{=\,\pi^4/15}
  = \underbrace{\frac{8\pi^5 k^4}{15 c^3 h^3}}_{=\,4\sigma/c}\, T^4\,, \tag{16.30}
\]

where $\sigma$ is the Stefan-Boltzmann constant, defined in (16.40) ahead. That is, the total
photon energy inside an oven of volume V and temperature T is

\[
\text{total photon energy} = UV = \frac{4\sigma}{c}\,T^4 V\,, \tag{16.31}
\]

where $\sigma = 5.67 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$ is the Stefan-Boltzmann constant, defined in (16.40)
ahead.
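The value $\pi^4/15$ of the integral, and the resulting size of $\sigma$, can be confirmed numerically. The sketch below uses a simple midpoint rule with an assumed upper cutoff of x = 60 (the integrand is negligible beyond that); the constants are standard SI values, and $\sigma$ is formed as c/4 times the coefficient of $T^4$ in (16.30).

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s

def bose_integral(upper=60.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x^3/(e^x - 1) from 0 to `upper`."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        total += x**3 / math.expm1(x)
    return total * width

integral = bose_integral()
exact = math.pi**4 / 15.0

# Stefan-Boltzmann constant: c/4 times the coefficient of T^4 in (16.30).
sigma = 2.0 * math.pi**5 * K**4 / (15.0 * C**2 * H**3)
energy_coefficient = 4.0 * sigma / C   # U / T^4, in J m^-3 K^-4

print(integral, exact, sigma, energy_coefficient)
```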

16.3  Planck’s  Law  in  terms  of  Wavelength 

Since an increase in frequency df is accompanied by a decrease in wavelength $-d\lambda$, where
df and $-d\lambda$ have the same sign, define the energy density over wavelength $\varrho(\lambda)$ by

\[
\varrho(\lambda)\,(-d\lambda) = \varrho(f)\,df\,, \tag{16.32}
\]

which leads to

\[
\varrho(\lambda) = \frac{8\pi h c/\lambda^5}{e^{hc/(\lambda kT)} - 1}\,, \tag{16.33}
\]

where we have used $f = c/\lambda$ and $df/d\lambda = -c/\lambda^2$. Although we won’t do so here, it’s
not difficult to use (16.33) to derive Wien’s law, which gives the “most copiously emitted
wavelength” $\lambda_0$. We simply solve $\varrho'(\lambda_0) = 0$ numerically to get

\[
\lambda_0 T \simeq 2.898\ \mathrm{mm\,K}\,. \tag{16.34}
\]

The same idea serves to determine the “most copiously emitted frequency” $f_0$ from (16.27).
The result is

\[
\frac{f_0}{T} = \text{constant} \simeq 58.8\ \mathrm{GHz\,K^{-1}}\,. \tag{16.35}
\]

It might be thought that $f_0\lambda_0 = c$, but such is not the case! The reason for this apparent
anomaly is that the phrase “most copiously emitted frequency” implies that there are
various frequencies present, like balls of various colours, and we are finding the ball of the
most common colour, and similarly for wavelength. But this is not quite the case; we have
assumed that frequency and wavelength are continuous, so that the equal-width frequency
bins that we are essentially comparing to find the “most copiously emitted frequency” do
not map to equal-width wavelength bins, since frequency and wavelength are not related
linearly. So the phrase “most copiously emitted” should be taken with a grain of salt.
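Both peak positions can be found with a few lines of code. Setting the derivative of $x^p/(e^x - 1)$ to zero, with $x = hf/(kT)$, gives the condition $x = p\,(1 - e^{-x})$, where p = 3 for the frequency form (16.27) and p = 5 for the wavelength form (16.33). The sketch below solves this by fixed-point iteration; the function name, the starting guess and the iteration count are arbitrary illustrative choices.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s

def solve_peak(power, start=5.0, iterations=100):
    """Solve x = power * (1 - exp(-x)) by fixed-point iteration.

    This is the stationarity condition for x**power / (exp(x) - 1).
    """
    x = start
    for _ in range(iterations):
        x = power * (1.0 - math.exp(-x))
    return x

x_frequency = solve_peak(3)    # peak of the frequency spectrum
x_wavelength = solve_peak(5)   # peak of the wavelength spectrum

f0_over_T = x_frequency * K / H            # Hz per kelvin
lambda0_T = H * C / (x_wavelength * K)     # metre kelvin
product_over_c = f0_over_T * lambda0_T / C # equals x_frequency / x_wavelength

print(f0_over_T / 1e9, lambda0_T * 1e3, product_over_c)
```

The last number printed is $f_0\lambda_0/c$, which comes out well below 1, in line with the remark that the two peaks do not correspond to the same wave.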

16.4  Radiation  Exiting  the  Oven 

Make a small hole in the oven. How much energy escapes per second? To determine this,
place the origin of a cartesian coordinate system at the hole, and let the wall containing
the hole be the xz plane, with the y axis pointing into the oven. The energy passing
through the hole of area dA in a time $\Delta t$ is that of some of the photons that are within a
distance $c\,\Delta t$ of the hole: the photons of interest are those moving in the correct direction
to encounter the hole. The energy escaping the hole from a volume dV at a distance r
from the hole (call it a) is then determined by the solid angle subtended by dA as seen
from dV, or, equivalently, the area that dA projects onto a sphere of radius r centred
at dV:

\[
a = U\,dV \times \frac{\text{``area on sphere'' subtended by } dA \text{ as seen from } dV}{4\pi r^2}\,. \tag{16.36}
\]

The “area on the sphere” is the dot product of the hole area expressed as a vector, $-dA\,\mathbf{e}_y$
(where $\mathbf{e}_y$ is the usual y basis vector, of unit length), with the “look direction” from dV,
which is minus the radial basis vector, $-\mathbf{e}_r$ (of unit length):

\[
\text{``area on sphere''} = -dA\,\mathbf{e}_y \cdot (-\mathbf{e}_r)
  = dA\,(0, 1, 0) \cdot (\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta)
  = dA\,\sin\theta\sin\phi\,, \tag{16.37}
\]

where $\theta, \phi$ are the usual spherical polar coordinates. So (16.36) becomes

\[
a = U\,r^2 \sin\theta\,dr\,d\theta\,d\phi \times \frac{dA\,\sin\theta\sin\phi}{4\pi r^2}\,. \tag{16.38}
\]

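That means the total energy passing through the hole of area dA in a time $\Delta t$ is

\[
\int_0^{c\Delta t} dr \int_0^{\pi} \sin^2\theta\,d\theta \int_0^{\pi} \sin\phi\,d\phi\ \frac{U\,dA}{4\pi}
  = U\,\frac{c}{4}\,dA\,\Delta t\,. \tag{16.39}
\]

So the energy radiated per unit hole area per unit time is $Uc/4$. In other words,

\[
\text{power radiated per unit hole area} = \frac{Uc}{4} = \frac{2\pi^5 k^4}{15 c^2 h^3}\,T^4 = \sigma T^4\,, \tag{16.40}
\]

where $\sigma = 5.67 \times 10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$ is the Stefan-Boltzmann constant.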
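The factor of 1/4 comes entirely from the angular integrals, taken over the half space inside the oven ($0 \le \theta \le \pi$, $0 \le \phi \le \pi$, so that y > 0). A quick numerical check, sketched here with a hand-written midpoint rule (the helper name and step count are illustrative choices):

```python
import math

def midpoint_integral(func, lower, upper, steps=10_000):
    """Midpoint-rule estimate of the integral of func from lower to upper."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

# Angular integrals appearing in (16.39).
theta_part = midpoint_integral(lambda t: math.sin(t) ** 2, 0.0, math.pi)
phi_part = midpoint_integral(math.sin, 0.0, math.pi)

# Fraction of U * c * dt * dA that passes through the hole.
escape_factor = theta_part * phi_part / (4.0 * math.pi)
print(theta_part, phi_part, escape_factor)
```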

Note that although the U here is the total energy per unit volume of the oven, it can
also stand for the total energy per unit volume of the oven per unit frequency, or per unit
wavelength; i.e., $\varrho(f)$ or $\varrho(\lambda)$. The conversion to a power was simply accomplished with
the factor of c/4. For example, the power radiated per unit hole area per unit wavelength
is, from (16.33),

\[
\frac{c}{4}\,\varrho(\lambda) = \frac{2\pi h c^2/\lambda^5}{e^{hc/(\lambda kT)} - 1}\,. \tag{16.41}
\]


16.5  Radiation  from  a  Black  Body  (usually  called 
“blackbody  radiation”) 

We  now  return  to  where  we  started,  with  the  task  of  calculating  how  much  radiation  is 
emitted  by  a  black  body.  We  argued  that  it  must  emit  what  it  absorbs,  which  is  thus  the 
same  (Planck)  spectrum  produced  by  the  oven.  So  we  infer  that  the  black  body’s  radiated 
power equals that which emerges from a hole made in the side of the oven. That means
the power radiated by a black body per unit area of its surface is $\sigma T^4$, from (16.40). If it
has area A, then

\[
\text{total power emitted by black body} = A\sigma T^4\,. \tag{16.42}
\]

A body that isn’t black has an emissivity $e(\lambda, T)$, typically measured experimentally. The
emissivity is sometimes approximated by a constant e, so

\[
\text{total power emitted by any body} \simeq A e \sigma T^4\,. \tag{16.43}
\]


Example: The sun’s power output is fitted well by a Planck spectrum for T = 5800 K,
so we can treat it as a black body with this surface temperature. It has radius r = 700,000 km,
and its internal temperature (averaged over its volume) is about $10^7$ K. If the sun’s
thermonuclear reactions stopped today, how long could it continue to emit at its current rate?

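Use (16.31) and (16.43) to write

\[
\begin{aligned}
\text{this time period} &= \frac{\text{total photon energy inside}}{\text{total current luminosity}}
  = \frac{\tfrac{4}{3}\pi r^3 \times \dfrac{4\sigma}{c}\,T_{\rm internal}^4}{4\pi r^2\,\sigma\,T_{\rm surface}^4}
  = \frac{4r}{3c}\left(\frac{T_{\rm internal}}{T_{\rm surface}}\right)^4 \\
&= \frac{4 \times 7\times10^{8}}{3 \times 3\times10^{8}} \left(\frac{10^{7}}{5800}\right)^4 \times \frac{1}{31.5\times10^{6}}\ \text{years}
  \simeq 870{,}000\ \text{years}. \quad \text{Answer}
\end{aligned}
\tag{16.44}
\]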
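The arithmetic of this example is easy to reproduce. The following sketch simply evaluates the final expression of (16.44) with the rounded values used in the text; the function and variable names are illustrative.

```python
RADIUS_M = 7.0e8            # solar radius, m
C = 3.0e8                   # speed of light, m/s (rounded, as in the text)
T_INTERNAL = 1.0e7          # volume-averaged internal temperature, K
T_SURFACE = 5800.0          # surface temperature, K
SECONDS_PER_YEAR = 31.5e6

def emission_time_years(radius, t_internal, t_surface):
    """Stored photon energy divided by luminosity: (4r/3c) (T_int/T_surf)^4, in years."""
    seconds = 4.0 * radius / (3.0 * C) * (t_internal / t_surface) ** 4
    return seconds / SECONDS_PER_YEAR

years = emission_time_years(RADIUS_M, T_INTERNAL, T_SURFACE)
print(years)
```

Because the result scales as the fourth power of the temperature ratio, it is very sensitive to the assumed internal temperature.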


16.6  The  Greenhouse  Effect 


Consider a glass greenhouse with its ceiling somewhere above the ground, along with an
incoming flux density $J_i$ (i.e. power per unit area) of solar radiation. Almost all of the
mainly visible light of the solar spectrum passes through glass without being absorbed and
re-scattered. It heats the ground, though not to solar temperatures of course, so the ground
re-radiates an “outgoing” flux density $J_o$, but at a much longer wavelength: typically
largely in the infra-red.

Glass absorbs some of this infra-red light and re-radiates a portion $aJ_o$ back to the
ground, which further heats the ground. When equilibrium is reached, there can be no
net flow anywhere, so consider an imaginary plane between glass and ground. The flux
density down through this imaginary plane, $J_i + aJ_o$, equals the flux density up through
it, $J_o$, so

\[
J_o = \frac{J_i}{1 - a}\,. \tag{16.45}
\]

That means $J_o > J_i$. Alternatively, place the imaginary plane above the glass. Now what
comes down through this imaginary plane, $J_i$, equals what goes up through it, $(1 - a)J_o$,
giving us (16.45) again.

Now consider $J_o$ with and without the glass ceiling present:

\[
\frac{\sigma T^4(\text{glass})}{\sigma T^4(\text{no glass})}
  = \frac{J_o(\text{glass})}{J_o(\text{no glass})}
  = \frac{J_o(\text{glass})}{J_i}
  = \frac{1}{1 - a}\,. \tag{16.46}
\]

Therefore the glass heats the ground in the ratio

\[
\frac{T(\text{glass})}{T(\text{no glass})} = \left(\frac{1}{1 - a}\right)^{1/4} > 1\,. \tag{16.47}
\]


Example: Earth receives a mean flux density of solar energy of $J_i = 175\ \mathrm{W\,m^{-2}}$,
averaged over all latitudes and all times of the day and night. Of this, 90% is absorbed
and 10% reflected back into space. Assuming Earth’s infra-red emissivity is 0.9, what
would be the average temperature on Earth’s surface if it had no atmosphere?

90% of $175\ \mathrm{W\,m^{-2}}$ is $158\ \mathrm{W\,m^{-2}}$. At equilibrium this must all be re-radiated, so

\[
0.9\,\sigma T^4 = 158\ \mathrm{W\,m^{-2}}\,. \tag{16.48}
\]

Thus

\[
T = \left(\frac{158}{0.9 \times 5.67\times10^{-8}}\right)^{1/4} \mathrm{K} = 236\ \mathrm{K} = -37^\circ\mathrm{C}. \quad \text{Answer} \tag{16.49}
\]

Now  introduce  our  atmosphere,  which  absorbs  and  re-radiates  almost  all  of  the  radiation 
leaving  the  ground.  What  temperature  results? 

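In this case $a = 1/2$, since we are taking all of $J_o$ to be absorbed, and half of this is
radiated back to the ground. So (16.47) gives

\[
\frac{T(\text{atmos.})}{T(\text{no atmos.})} = \left(\frac{1}{1 - 1/2}\right)^{1/4} = 2^{1/4}\,, \tag{16.50}
\]

resulting in

\[
T(\text{atmos.}) = 2^{1/4} \times 236\ \mathrm{K} = 281\ \mathrm{K} = 8^\circ\mathrm{C}. \quad \text{Answer} \tag{16.51}
\]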
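The two temperatures of this example can be reproduced directly from (16.47) and (16.48). This is a minimal sketch using the numbers quoted in the text; the function names are illustrative.

```python
SIGMA = 5.67e-8   # Stefan-Boltzmann constant, W m^-2 K^-4

def equilibrium_temperature(absorbed_flux, emissivity):
    """Temperature at which emissivity * sigma * T^4 balances the absorbed flux."""
    return (absorbed_flux / (emissivity * SIGMA)) ** 0.25

def greenhouse_ratio(a):
    """Temperature ratio T(glass)/T(no glass) of (16.47) for re-radiated fraction a."""
    return (1.0 / (1.0 - a)) ** 0.25

absorbed = 0.9 * 175.0                     # W m^-2, about 158
t_bare = equilibrium_temperature(absorbed, 0.9)
t_atmos = greenhouse_ratio(0.5) * t_bare   # half of the outgoing flux returned

print(t_bare, t_atmos)
```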


16.7  Thermal  Noise  and  Maximum  Channel  Capacity 

Thermal fluctuations in electrical circuits produce noise that manifests as voltage fluctuations.
To explore this, model a circuit resistor as a one-dimensional oven of length L,
carrying electromagnetic waves as before. We will mimic the derivation of $\varrho(f)$ of Section 16.1,
but it will be simpler this time. We’ll also focus not on $\varrho(f)$ by itself, but on
the total energy inside the resistor over a frequency range $f \to f + df$, which is $\varrho(f)\,df\,L$.
The one-dimensional analogue of (16.2) is

\[
\varrho(f)\,L = \varepsilon(f)\,g(f)\,. \tag{16.52}
\]

As ever, $\varepsilon(f)$ is the mean thermal energy of an oscillator, now inside the resistor instead
of an oven wall. This is again (16.6). Usually we are concerned with sub-GHz frequencies
at room temperature, for which $hf \ll kT$, so that $\varepsilon(f) \simeq kT$ (the equipartition value).

The  discussion  beginning  on  page  67  of  how  to  calculate  the  density  of  states  g(f) 
applies  equally  here,  but  now  the  wave  vector  space  is  one  dimensional.  We  can  choose 
either of conditions (a) or (b) on pages 68 and 69, and choose to calculate $\Omega_{\rm tot}$ or use the
alternative approach that produced (16.26). The results will all be the same.

We’ll choose condition (a) and calculate $\Omega_{\rm tot}$. A continuum of waves is held in the
resistor (so that both signs of k are used), and a whole number n of wavelengths fit into
the resistor’s length L. Again remembering there are 2 polarisations,


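\[
\Omega_{\rm tot} = 2 \times \text{number of cells}
  = 2 \times \frac{\text{length of interval } [-k, k]}{\text{length of one cell } (= \Delta k)}
  = \frac{4k}{\Delta k}\,. \tag{16.53}
\]

But $n\lambda = L$, so

\[
k = \frac{2\pi}{\lambda} = \frac{2\pi n}{L}\,, \quad \text{and therefore} \quad \Delta k = \frac{2\pi}{L}\,. \tag{16.54}
\]

Also $k = 2\pi f/c$, so the number of frequency states is

\[
\Omega_{\rm tot} = \frac{4k}{\Delta k} = \frac{4 \times 2\pi f/c}{2\pi/L} = \frac{4fL}{c}\,. \tag{16.55}
\]

Thus

\[
g(f) = \Omega_{\rm tot}'(f) = \frac{4L}{c}\,. \tag{16.56}
\]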

Now use (16.52) to get the total energy inside the resistor over a frequency range $f \to f + df$
for the sub-GHz frequencies of interest:

\[
\text{total energy within resistor in } df = \varrho(f)\,df\,L = \varepsilon(f)\,g(f)\,df \simeq kT\,4L\,df/c\,. \tag{16.57}
\]

Typically we require the total energy held in some frequency range B, where B is called
the bandwidth. Integrate (16.57) to find this total energy:

\[
\text{total energy within resistor in } B \simeq 4kTLB/c\,. \tag{16.58}
\]


If this energy all emerges in a time L/c by moving along the resistor at speed c (an adequate
approximation), then

\[
\text{power out} = \frac{\text{energy out}}{\text{time taken}} = \frac{4kTLB/c}{L/c} = 4kTB\,. \tag{16.59}
\]


This is an average of course; it’s all based on the idea of electromagnetic fluctuations
occurring inside the resistor. It is given the name Nyquist’s theorem for thermal noise in
circuits:

\[
\text{noise generated in a circuit} = 4kTB\,. \tag{16.60}
\]

This noise power manifests as a fluctuating voltage, since the power dissipated in a resistor R
due to a voltage V across it is $V^2/R$. In that case

\[
\langle V^2/R \rangle = 4kTB\,, \tag{16.61}
\]

so that the mean-square voltage arising from the noise is

\[
\langle V^2 \rangle = 4RkTB\,. \tag{16.62}
\]
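To get a feel for the size of this noise, (16.62) can be evaluated for a typical component. The resistance, temperature and bandwidth below are assumed values chosen purely for illustration; they do not come from the text.

```python
import math

K = 1.380649e-23   # Boltzmann constant, J/K

def rms_noise_voltage(resistance_ohm, temperature_k, bandwidth_hz):
    """Root-mean-square noise voltage from <V^2> = 4 R k T B, equation (16.62)."""
    return math.sqrt(4.0 * resistance_ohm * K * temperature_k * bandwidth_hz)

# Assumed example: a 1 kilohm resistor at 300 K observed over 1 MHz of bandwidth.
v_rms = rms_noise_voltage(1.0e3, 300.0, 1.0e6)
print(v_rms)   # a few microvolts
```

Since the mean-square voltage is proportional to B, the rms voltage grows only as the square root of the bandwidth.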


The noise of complicated circuits arises from many sources interacting with each other in
various ways, so that more generally the “4” in Nyquist’s theorem (16.60) is replaced by
the circuit’s noise factor, F.

Why might we be interested in having some particular bandwidth B? One reason in
an era of communication is the Shannon-Hartley theorem, which states that the maximum
transmission capacity C that some data-transmitting channel can have, below which we
can always arrange for an arbitrarily low error rate, is a function of the bandwidth B that
we use, and the signal-to-noise ratio S/N that we require:

\[
C = B \log_2(1 + S/N) \quad \text{bits/unit time}, \tag{16.63}
\]

where a “bit” is a binary digit, 0 or 1, and S, N are the signal and noise powers respectively.
For example, if we want to send a signal with signal-to-noise ratio of S/N = 10, and we
have B = 1 MHz of bandwidth at our disposal, then the maximum throughput for which we
can ever hope to arrange an arbitrarily low error rate is 1 MHz × $\log_2 11$, or 3.46 megabits
per second.

By  “hoping  to  arrange  an  arbitrarily  low  error  rate”  we  mean  the  following.  When  signals  are  sent 
down  a  line,  errors  can  always  be  introduced  by  noise  en  route  and  in  the  receiver.  Sophisticated 
error-correction  algorithms  can  correct  some  of  these  errors;  but  the  higher  the  percentage  of  errors 
we wish to correct, the more sophisticated our algorithm will need to be. The Shannon-Hartley
theorem puts an upper bound on the amount of information that we can ever send, even if we
have an all-powerful error-correction algorithm that corrects 100% of the errors.

Equation  (16.63)  shows  that  to  achieve  a  high  transmission  rate,  we  should  have  a  large 
bandwidth. 

This  makes  sense  from  the  viewpoint  of  Fourier  analysis:  having  a  large  bandwidth  means  we  have 
a  large  range  of  frequencies  at  our  disposal,  and  that  means  we’re  able  to  craft  signal  waveforms 
with  “tighter  turns”  in  them.  So  for  example,  if  we  are  transmitting  a  square  wave  that  encodes 
binary  digits,  more  bandwidth  allows  us  to  squeeze  more  oscillations  of  the  wave  into  a  given 
length.  After  all,  signals  travel  at  a  set  speed,  so  if  we  want  to  send  more  of  them  per  second,  we 
have  to  make  each  one  shorter.  This  need  for  a  “broader  band”  of  frequencies  to  send  more  data 
is  the  origin  of  the  term  broadband  used  frequently  by  Internet  service  providers. 

But  in  an  electronic  circuit,  the  noise  N  is  given  by  FkTB.  So  increasing  the  bandwidth 
has  two  competing  effects:  it  partly  acts  to  increase  the  maximum  transmission  capacity  C , 
but  it  also  partly  acts  to  decrease  C  because  increasing  B  introduces  more  noise  into  the 
system,  through  Nyquist’s  theorem.  You  can  see  that  calculating  a  channel’s  maximum 
capacity  requires  knowledge  of  its  noise  factor  and  temperature. 
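The competition between the two effects of bandwidth can be made concrete by inserting N = FkTB into (16.63). The sketch below does this for an assumed signal power, noise factor and temperature (all three are illustrative values, not taken from the text). As B grows, C keeps rising but levels off towards the limit S/(FkT ln 2), which follows from $\log_2(1 + x) \to x/\ln 2$ for small x.

```python
import math

K = 1.380649e-23   # Boltzmann constant, J/K

def shannon_capacity(bandwidth_hz, snr):
    """Maximum capacity in bits per second, equation (16.63)."""
    return bandwidth_hz * math.log2(1.0 + snr)

def capacity_with_thermal_noise(bandwidth_hz, signal_w, noise_factor, temperature_k):
    """Capacity when the noise power is N = F k T B."""
    noise_w = noise_factor * K * temperature_k * bandwidth_hz
    return shannon_capacity(bandwidth_hz, signal_w / noise_w)

# The worked example in the text: S/N = 10 with 1 MHz of bandwidth.
example = shannon_capacity(1.0e6, 10.0)

# Assumed values for illustration: 1 pW of signal, F = 4, T = 300 K.
SIGNAL_W, F, T = 1.0e-12, 4.0, 300.0
c_narrow = capacity_with_thermal_noise(1.0e6, SIGNAL_W, F, T)
c_wide = capacity_with_thermal_noise(1.0e9, SIGNAL_W, F, T)
limit = SIGNAL_W / (F * K * T * math.log(2.0))   # capacity as B -> infinity

print(example, c_narrow, c_wide, limit)
```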


17  Theory  of  Electric  Conduction 

Here  we  outline  an  example  of  where  the  classical  picture  of  electric  conduction  fails,  and 
investigate  how  quantum  mechanics  steps  in  to  begin  to  solve  the  problem. 


17.1  The  Classical  Picture 

The electrons that carry a current through a wire of end area A can be modelled as a gas
of free electrons inside a metal lattice. Write

\[
\begin{aligned}
n &= \text{electron number density}, & q &= \text{electric charge},\\
v_d &= \text{electron drift speed}, & A &= \text{area of wire},\\
m &= \text{electron mass}.
\end{aligned}
\tag{17.1}
\]

The charge crossing A in a time $\Delta t$ equals the charge contained in the swept volume $A v_d \Delta t$,
which is $nqAv_d\Delta t$. The electric current in the wire is then

\[
I = \frac{\text{charge passed}}{\Delta t} = \frac{nqAv_d\,\Delta t}{\Delta t} = nqAv_d\,, \tag{17.2}
\]

and the current density is

\[
J = \frac{I}{A} = nqv_d\,. \tag{17.3}
\]


On average, $v_d$ is about half the speed picked up from an acceleration due to a force Eq
(exerted by the electric field E) that acts for a time of $\lambda/\bar v$, where $\lambda$ is the mean free path
of the electrons, and $\bar v$ is their mean thermal speed (from the Maxwell speed distribution).
So

\[
v_d = \frac{1}{2}\,\frac{Eq}{m}\,\frac{\lambda}{\bar v}\,, \tag{17.4}
\]

and therefore

\[
J = \frac{nEq^2\lambda}{2m\bar v}\,. \tag{17.5}
\]

What does experiment say? Ohm’s rule gives the resistance across a length $\ell$ as

\[
R = \frac{V}{I} = \frac{E\ell}{JA} = \frac{\varrho\,\ell}{A}\,, \tag{17.6}
\]

where $\varrho$ is the resistivity of the conductor. This means that, experimentally,

\[
J = E/\varrho\,. \tag{17.7}
\]

This agrees with (17.5), which predicts $J \propto E$. So far so good, for our (classical) microscopic
model of electric current. But notice that (17.5), (17.7) also imply

\[
\varrho = \frac{E}{J} = \frac{2m\bar v}{nq^2\lambda}\,. \tag{17.8}
\]


We know that $\bar v \propto \sqrt{T}$, from the Maxwell speed distribution. We might use (13.2) to write
the mean free path amongst fixed lattice atoms, with $\nu$ of these atoms per unit volume
and each with cross section $\sigma$:

\[
\lambda = \frac{1}{\nu\sigma}\,. \tag{17.9}
\]


In that case, we conclude that $\varrho \propto \sqrt{T}$. However, this is experimentally wrong.
Experiments show that $\varrho \propto T$. We can try to fix this by modifying the mean free path of the
electrons among the lattice atoms. Suppose an electron sees a lattice atom (mass M)
vibrating in three dimensions with circular frequency $\omega$ and amplitude A. Equipartition
gives this atom a vibrational energy

\[
\tfrac{3}{2}\,M\omega^2 A^2 = \tfrac{3}{2}\,kT\,, \tag{17.10}
\]

so that it presents a cross section to wandering electrons of

\[
\sigma = \pi A^2 = \frac{\pi kT}{M\omega^2}\,. \tag{17.11}
\]

This combines with (17.9) to give

\[
\lambda = \frac{M\omega^2}{\nu\pi kT}\,. \tag{17.12}
\]


With this temperature-dependent $\lambda$, (17.8) predicts $\varrho \propto T^{3/2}$. But this still disagrees with
experiment.


17.2  The  Quantum  Picture 

Quantum mechanics clears up the above puzzling aspects of conduction in great depth by
incorporating the proposition that electrons are fermions. This proposition can be taken
as confirmed, thanks to the experimental success of the predictions that result.

In  Section  15  we  saw  that  at  most  only  one  fermion  can  occupy  a  given  quantum  state. 
This  is  a  severe  restriction  that  drastically  alters  the  quantum  behaviour  of  electrons  as 
compared  to  a  classical  treatment.  But  it  successfully  predicts  properties  of  conductors, 
semi-conductors,  and  insulators  that  are  not  predicted  classically.  We’ll  see  some  of  this 
in  what  follows. 

The  simplest  quantum  viewpoint  models  the  gas  of  electrons  as  noninteracting  particles 
in a cubic infinite square well of side length L. Solving Schrödinger’s equation for such
a “particle in a box” is a standard and straightforward exercise in introductory quantum
mechanics textbooks. The infinite potential energy at the walls of the box constrains each
particle to the box. Inside, the zero potential energy makes Schrödinger’s equation easy
to solve for its wave functions and energy eigenvalues. These eigenvalues are interpreted
as the quantised energy levels of the gas of particles, and are

\[
E = \frac{h^2}{8mL^2}\left(n_x^2 + n_y^2 + n_z^2\right) \equiv E_1\left(n_x^2 + n_y^2 + n_z^2\right). \tag{17.13}
\]


How many electrons can fill these quantum states up to energy E? This number of electrons
equals the number of states $\Omega_{\rm tot}$ with energy $\le E$ (recall Section 3.1), which, with two
spins allowed, is twice the number of unit cubes in $n_x n_y n_z$ space, in the octant of radius
$\sqrt{n_x^2 + n_y^2 + n_z^2}$, i.e. of radius $\sqrt{E/E_1}$. This number is twice the volume of the octant of
positive $n_x, n_y, n_z$ out to this radius, or

\[
\text{number of electrons} = \Omega_{\rm tot} = 2 \times \frac{1}{8} \times \frac{4\pi}{3}\left(\frac{E}{E_1}\right)^{3/2}
  = \frac{\pi}{3}\left(\frac{E}{E_1}\right)^{3/2} = nL^3\,, \tag{17.14}
\]


since there are n electrons per unit volume in the cube. At T = 0 these will fill the lowest
energy states up to the Fermi energy $E_F$, which in this context is just another name
for the chemical potential $\mu$ at this temperature: the reason is that a plot of (15.10)
at T = 0 shows that the occupation number changes abruptly from 1 to 0 at energy $\mu$. So
we conclude that at T = 0,

\[
E_F = \frac{h^2}{8m}\left(\frac{3n}{\pi}\right)^{2/3}. \tag{17.15}
\]


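Example: Calculate $E_F$ (T = 0) for copper metal, which has $n = 8.47 \times 10^{28}$ electrons/m$^3$.
The electron’s mass is $m = 9.11 \times 10^{-31}$ kg.

Equation (17.15) gives, in SI units with a final conversion to electron volts,

\[
E_F = \frac{(6.626\times10^{-34})^2}{8 \times 9.11\times10^{-31}} \left(\frac{3 \times 8.47\times10^{28}}{\pi}\right)^{2/3} \times \frac{1}{1.6\times10^{-19}}\ \mathrm{eV} \simeq 7.0\ \mathrm{eV}. \quad \text{Answer} \tag{17.16}
\]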
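The copper calculation can be reproduced in a few lines. This sketch evaluates (17.15) with the same rounded constants as the text; the function name is illustrative.

```python
import math

H = 6.626e-34            # Planck constant, J s (rounded as in the text)
ELECTRON_MASS = 9.11e-31 # kg
EV = 1.6e-19             # joules per electron volt (rounded as in the text)

def fermi_energy_joules(number_density):
    """Fermi energy at T = 0 from (17.15), for electrons of density n per m^3."""
    return H**2 / (8.0 * ELECTRON_MASS) * (3.0 * number_density / math.pi) ** (2.0 / 3.0)

N_COPPER = 8.47e28       # conduction electrons per m^3
fermi_energy_ev = fermi_energy_joules(N_COPPER) / EV
print(fermi_energy_ev)
```

Note that the result depends on the electron density only through $n^{2/3}$, so doubling n raises $E_F$ by a factor of about 1.6.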


- Note that the gain in energy from lattice atoms is $\simeq kT$ ($\simeq 1/40$ eV at room temperature),
so only those electrons within kT of $E_F$ can be excited into unoccupied levels.
(Incidentally, the characteristic width of the sloping section of the n(E)-vs-E plot
for fermions is around kT as well.)

- The electrons don’t scatter off each other. The reason is that if this were to
happen, pushing one into an unoccupied level, the other would have to drop to a
lower unoccupied level, which doesn’t exist, because the levels are all filled. So these
electrons don’t obey the Equipartition Theorem.


Mean  Electron  Energy 


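This is

\[
\bar E = \frac{\text{total } e^- \text{ energy}}{\text{total no. of } e^-} = \frac{1}{nL^3}\int_0^{E_F} E\,d\Omega_{\rm tot}\,, \tag{17.17}
\]

where $\Omega_{\rm tot}$ is the number of electrons out to some energy E. Equation (17.14) gives
$d\Omega_{\rm tot} = \frac{\pi}{2}\,E^{1/2}/E_1^{3/2}\,dE$. In that case,

\[
\bar E = \tfrac{3}{5}\,E_F\,. \tag{17.18}
\]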

Fermi  Temperature  TF 

Define this via

\[
kT_F = E_F\,. \tag{17.19}
\]

If $T \ll T_F$, the mean energy of the lattice atoms ($\simeq kT$) is $\ll E_F$, so the electron
occupation number distribution varies little from its T = 0 shape. For copper,

\[
T_F \simeq \frac{7.0 \times 1.6\times10^{-19}}{1.38\times10^{-23}}\ \mathrm{K} \simeq 81{,}000\ \mathrm{K}, \tag{17.20}
\]

and we conclude that copper’s gas of conduction electrons certainly cannot be treated
classically for any reasonable temperature at all.

Fermi Speed vF

Define this via

\[
\tfrac{1}{2}\,m v_F^2 = E_F\,. \tag{17.21}
\]


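For copper,

\[
v_F = \sqrt{\frac{2E_F}{m}} = \sqrt{\frac{2 \times 7.0 \times 1.6\times10^{-19}}{9.11\times10^{-31}}}\ \mathrm{m/s} \simeq 1600\ \mathrm{km/s}. \tag{17.22}
\]

This is not overly affected by temperature, at least for $T \ll 81{,}000$ K. Compare this to
the Maxwell mean speed $\bar v \propto T^{1/2}$. At room temperature,

\[
\bar v = \sqrt{\frac{2 \times 1.38\times10^{-23} \times 300}{9.11\times10^{-31}}}\ \mathrm{m/s} \simeq 95\ \mathrm{km/s}. \tag{17.23}
\]

(Note that some people define the Fermi speed via $\tfrac{1}{2}\,m v_F^2 = \tfrac{3}{5}\,E_F$.)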
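The three numbers quoted for copper follow from (17.19), (17.21) and the expression inside (17.23). Here is a short sketch that evaluates them with the rounded constants used in the text; the variable names are illustrative.

```python
import math

K = 1.38e-23              # Boltzmann constant, J/K (rounded as in the text)
EV = 1.6e-19              # joules per electron volt
ELECTRON_MASS = 9.11e-31  # kg

fermi_energy = 7.0 * EV   # copper, from (17.16)

fermi_temperature = fermi_energy / K                          # (17.19)
fermi_speed = math.sqrt(2.0 * fermi_energy / ELECTRON_MASS)   # (17.21)
# Thermal speed at 300 K, using the same expression as (17.23).
thermal_speed = math.sqrt(2.0 * K * 300.0 / ELECTRON_MASS)

print(fermi_temperature, fermi_speed / 1e3, thermal_speed / 1e3)
```

The Fermi speed exceeds the room-temperature thermal speed by a factor of roughly 16, which equals the square root of $T_F/(300\ \mathrm{K})$.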

The quantum mechanical point of view considers it more appropriate to replace the $\bar v$
in (17.8) with $v_F$. Since $v_F$ is effectively independent of T, and (17.12) gives $\lambda \propto 1/T$,
(17.8) becomes

\[
\varrho = \frac{2mv_F}{nq^2\lambda} \propto T\,, \tag{17.24}
\]

as observed experimentally.


So  quantum  mechanics  has  come  to  the  rescue  by  predicting  the  correct  dependence  on 
temperature  for  the  electrical  resistivity. 


17.3  Band  Theory  of  Solids 

To  understand  the  degree  of  availability  of  free  electrons,  we  must  consider  the  effect  of 
the  crystal  lattice  on  the  electron  energy  levels.  Solving  the  Schrodinger  equation  for  an 
electron  in  an  atom  yields  a  discrete  set  of  energy  levels.  However,  when  we  bring  two 
atoms  close  together,  the  energy  of  each  level  changes  due  to  the  influence  of  the  other 
atom.  (We  can  also  arrive  at  this  by  solving  the  Schrodinger  equation  for  electrons  moving 
in  a  periodic  potential.) 

If  we  bring  N  atoms  together  in  a  lattice,  a  particular  energy  level  splits  into  N  levels, 
forming  a  band.  The  band  width  is  determined  by  the  atomic  spacing — not  by  N,  so  for 
large  N  the  band  is  composed  of  an  almost  continuous  spread  of  energy  levels.  Bands  are 
typically  a  few  eV  thick,  and  may  overlap. 

One piece of direct evidence for bands in solids comes from X-ray spectra. Gaseous
sodium shows the expected sharp peaks due to energy level quantisation, whereas the same
peaks produced from solid sodium are broadened due to the bands being present.

Allowed  bands  are  continuous  bands  of  energy  levels  for  electrons. 

Forbidden  bands  are  regions  where  there  are  no  energy  levels. 

The band containing the outer electrons is called either the valence band if it’s full of
electrons (i.e. if all of its energy levels are occupied), or the conduction band if it isn’t full.
Valence bands correspond to insulators and semiconductors. Conduction bands correspond
to conductors.

We treat materials as well and truly insulators when the width of the first forbidden
band above their valence band is > 2 eV. (E.g. the width for diamond is 7 eV.) If the
width is < 2 eV, we call the material a semiconductor. Examples are silicon (1.1 eV) and
germanium (0.7 eV).


17.4  Insulators  and  Semiconductors 


Consider a material whose (filled) valence band extends to energy $E_v$. There is then a
gap of width $E_g$ which forms a forbidden band, and then the (almost empty) conduction
band begins at energy $E_c$. By symmetry, the Fermi energy $E_F$ lies in about the middle
of the gap, so that $E_F = E_v + E_g/2$. What is the number density $n_e$ of electrons in the
conduction band? Set $\beta = 1/(kT)$ to write

\[
n_e = \frac{1}{L^3}\int_{E_c}^{\infty} n(E)\,g(E)\,dE
    = \frac{1}{L^3}\int_{E_c}^{\infty} \frac{g(E)\,dE}{e^{\beta(E - E_F)} + 1}
    \simeq \frac{1}{L^3}\int_{E_c}^{\infty} e^{-\beta(E - E_F)}\,g(E)\,dE\,. \tag{17.25}
\]


The density of states g(E) depends heavily on the material. Remember from Section 3.1
that $g(E)\,dE = d\Omega_{\rm tot}$. For the simple model of particles in a box, $\Omega_{\rm tot}$ is given by (17.14):

\[
\Omega_{\rm tot} = \frac{\pi}{3}\left(\frac{E}{E_1}\right)^{3/2}, \quad \text{so} \quad g(E) = \frac{\pi}{2}\,\frac{E^{1/2}}{E_1^{3/2}}\,. \tag{17.26}
\]

What is more usual is to model the possibly complicated density of states by

\[
g(E) = \frac{\pi}{2}\,\frac{(E - E_c)^{1/2}}{E_1^{3/2}}\,. \tag{17.27}
\]


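Also, the electron mass m (inside $E_1$) is replaced by some effective mass $m^*$ that depends
on the nature of the lattice. In this new model, the electron number density of (17.25)
becomes

\[
n_e \simeq \frac{1}{L^3}\int_{E_c}^{\infty} e^{-\beta(E - E_F)}\,\frac{\pi}{2}\,\frac{(E - E_c)^{1/2}}{E_1^{3/2}}\,dE\,. \tag{17.28}
\]

Write $E - E_F = E - E_c + E_g/2$ and use a change of variables $u = (E - E_c)^{1/2}$. The integral
is then straightforward, and gives, up to a numerical factor of order one,

\[
n_e \sim n\left(\frac{kT}{E_F}\right)^{3/2} \exp\frac{-E_g}{2kT}\,. \tag{17.29}
\]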


Example: Calculate $n_e$ at room temperature using the electron number density and
Fermi energy of copper, for the two cases of forbidden band widths 7 eV (an insulator)
and 1 eV (a semiconductor).

We use kT = 1/40 eV, $n = 8.47 \times 10^{28}\ \mathrm{m^{-3}}$, and $E_F = 7$ eV:

Insulator: $E_g = 7$ eV, so

\[
n_e \simeq 8.47\times10^{28} \left(\frac{0.025}{7}\right)^{3/2} \exp\frac{-7}{0.05}\ \mathrm{m^{-3}} \simeq 3\times10^{-36}\ \mathrm{m^{-3}}. \tag{17.30}
\]

Semiconductor: $E_g = 1$ eV, so

\[
n_e \simeq 8.47\times10^{28} \left(\frac{0.025}{7}\right)^{3/2} \exp\frac{-1}{0.05}\ \mathrm{m^{-3}} \simeq 4\times10^{16}\ \mathrm{m^{-3}}. \tag{17.31}
\]
The comparatively huge number of conduction electrons in the semiconductor is evident.
Their number is also affected by temperature: materials that are insulators as $T \to 0$ can
become semiconductors as the temperature rises.
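The enormous contrast between the two cases comes almost entirely from the exponential in (17.29). The sketch below evaluates that expression for both gaps, using the values of the example; the function name and default arguments are illustrative.

```python
import math

N_COPPER = 8.47e28   # electron number density used in the example, m^-3
E_FERMI_EV = 7.0     # Fermi energy used in the example, eV
KT_EV = 0.025        # kT at room temperature, eV

def conduction_electron_density(gap_ev, n=N_COPPER, e_fermi_ev=E_FERMI_EV, kt_ev=KT_EV):
    """Evaluate n (kT/E_F)^(3/2) exp(-E_g / 2kT), as in (17.29), in m^-3."""
    return n * (kt_ev / e_fermi_ev) ** 1.5 * math.exp(-gap_ev / (2.0 * kt_ev))

insulator = conduction_electron_density(7.0)
semiconductor = conduction_electron_density(1.0)
print(insulator, semiconductor, semiconductor / insulator)
```

The ratio of the two densities is exp(120), about $10^{52}$, independent of the prefactor.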

17.5  Diodes 

Electrons  that  jump  across  the  gap  into  the  conduction  band  leave  behind  a  “hole”  in 
the  valence  band.  This  can  be  treated  as  another  particle,  but  with  positive  charge,  that 
contributes  to  the  current. 

By adding impurities (small amounts of other elements) to the semiconductor, we
can cause more-or-less-free electrons to move within the lattice (actually orbiting, say, an
arsenic atom at a large distance), or holes to move within the lattice (actually orbiting,
say, a gallium atom at a large distance). These donor impurities create electron energy
levels which allow electrons to be excited; thus they cause semiconductors to conduct much
better than if the conduction were due to thermal excitation alone.

An n-type semiconductor is one that has been “doped” with an element such as arsenic
or antimony, elements that donate electrons (negative charge, hence the name n-type).
These  donated  electrons  populate  new  energy  levels  that  appear  at  the  top  of  the  forbidden 
band.  Thermal  excitation  or  an  external  electric  field  can  quite  easily  bump  these  electrons 
into  the  conduction  band;  hence  the  dramatic  increase  in  conduction  due  to  the  impurity. 

A p-type semiconductor is doped with e.g. gallium or indium: metals that accept
electrons or, equivalently, donate holes (positive charge, hence the name p-type). These
holes  populate  new  energy  levels  that  appear  at  the  bottom  of  the  forbidden  band.  Thermal 
excitation  or  an  external  electric  field  can  now  easily  bump  electrons  from  the  top  of  the 
valence  band  into  these  holes. 

Suppose  we  have  two  doped  semiconductors:  the  p-type  has  holes  that  are  free  to 
wander  about  its  lattice,  and  the  n-type  has  electrons  that  are  free  to  wander.  If  we  join 
them  together  to  form  a  pn-semiconductor,  some  of  the  free  electrons  in  the  n-type  close 
to  the  junction  will  move  to  fill  the  immediately  adjacent  holes  on  the  p-type  side  of  the 
junction.  This  creates  a  slight  excess  of  negative  charge  on  the  p-type  side  of  the  junction, 
and  a  slight  excess  of  positive  charge  on  the  n-type  side  of  the  junction.  A  permanent 
internal  electric  field  has  now  been  created  across  the  junction,  pointing  from  the  slightly 
positive  n-type  side  to  the  slightly  negative  p-type  side. 

There  are  now  two  processes  continuously  occurring  across  the  junction: 

Thermal:  Free  electrons  from  the  n-type  side  are  always  being  propelled  by  thermal 
fluctuations  to  join  the  excess  negative  charge  on  the  p-type  side.  That  serves  to 
increase  the  field’s  strength.  The  Boltzmann  distribution  can  be  applied  to  determine 
how  many  thermal  electrons  are  propelled  across  the  junction. 

Electromagnetic:  The  strong  field  in  turn  keeps  returning  electrons  from  the  p-type 
side  back  home  across  the  junction.  That  acts  to  reduce  the  field’s  strength,  and  the 
whole situation is in a steady state.

The energy term in the Boltzmann distribution (10.20) will be the increase in potential
energy  that  the  n-type’s  free  electrons  “see”  from  where  they  are  to  the  far  side  of  the 
junction  before  they  are  thermally  propelled  across. 

Note  that  I  am  being  descriptive  in  my  wording  here  because  the  details  are  easy  to  get  confused. 

The  field  across  the  junction  points  from  n-type  to  p-type;  that  means  the  n-type’s  free  electrons 
experience  a  drop  in  electromagnetic  potential  when  they  are  forced  across  the  junction  by  thermal 
fluctuations.  But  because  they  have  negative  charge,  their  potential  energy  will  increase  during 
this  process. 

Should  the  energy  term  in  (10.20)  be  the  potential  energy  seen  on  the  far  side  of  the  junction 
by  the  electrons,  or  the  potential  energy  increase  that  they  see  across  the  junction?  In  fact  either 
will  work.  Remember  that  potential  energy  is  only  ever  defined  up  to  an  additive  constant.  We  can 
include  this  constant  in  the  exponential  in  (10.20),  but  it  will  only  get  absorbed  into  the  constant 
of  proportionality  in  that  equation.  So  we  might  as  well  set  it  equal  to  zero. 

So  set  the  potential  energy  of  a  free  electron  on  the  n-type  side  of  the  junction  to  zero.  It 
then  sees  a  potential  energy  on  the  p-type  side  of  U0  >  0. 

We  can  envisage  the  continuous  thermal  and  electromagnetic  flows  of  electrons  as 
follows.  (In  the  next  paragraphs,  it  helps  to  consider  the  current  of  electrons  as  a  particle 
current,  so  we’ll  use  a  lower-case  i  to  refer  to  electron  current,  and  an  upper-case  I  to  refer 
to  conventional  circuit-theory  current.) 

Thermal: A current $i_\text{thermal} > 0$ of electrons going from n to p (i.e. this is not
electric current of the “conventional” sign) due to thermal fluctuations boosts these
electrons into a higher potential energy. The number of electrons forming the current obeys
the Boltzmann distribution, which closely approximates the tail of the Fermi-Dirac
distribution for the electron occupation number at energies above the forbidden band,
or $> E_F$.

Electromagnetic: This current of electrons $i_0 > 0$ (again not “conventional” electric
current) flows from p to n across the junction, driven by the permanent internal electric
field. A typical value of $i_0$ is about a milliamp.

In  equilibrium  these  flows  are  balanced,  so  there  is  no  net  current  at  all. 

Suppose that now we apply a bias voltage, by connecting the p-type to one terminal
of an electric cell of voltage $V_b > 0$, and the n-type to the other terminal. We’ll
“forward bias” the diode, connecting the p-type to the positive terminal. Now the potential
energy of an electron on the p side of the junction decreases from $U_0$ to $U_0 - eV_b$
(where $e = 1.6 \times 10^{-19}$ C). This lowers the potential barrier for the n-type’s free
electrons to form the thermal current, but doesn’t affect $i_0$, which is a kind of
ever-present background current due to the base conditions existing internally to the
junction. Set

$$ \begin{aligned}
   I &= \text{conventional electric current through diode from p-type to n-type} \\
     &= \text{electron current through diode from n to p} \\
     &= i_\text{thermal} - i_0.
   \end{aligned} \tag{17.32} $$

But notice that

$$ i_\text{thermal} \propto \exp\frac{-\Delta\,\text{pot. energy (n}\to\text{p)}}{kT},
   \quad\text{so}\quad
   i_\text{thermal} = C' \exp\frac{-(U_0 - eV_b)}{kT} = C \exp\frac{eV_b}{kT} \tag{17.33} $$

for some normalisations $C, C'$. For the case of no bias, $i_\text{thermal} = C$, which
therefore equals $i_0$, because $I = 0$ with no bias. That means (17.32) can always be written

$$ I = i_0 \left( e^{eV_b/kT} - 1 \right). \tag{17.34} $$


The case of a negative value of $V_b$ is a reverse bias, meaning the p-type has been
connected to the negative cell terminal (and n-type to positive). Here the potential energy
step that the electrons must jump (n→p) gets higher, so $i_\text{thermal} \to 0$ and the only
current $I$ across the junction is the small background electric current $-i_0$ (the negative
sign is due to $I$ being conventional electric current). So the diode passes almost no
current when it’s reverse biased, as (17.34) shows. Commercial diodes can be quite robust
when reverse biased, and will only pass 1 or 2 milliamps even when $V_b$ equals minus several
hundred volts.

But when the diode is forward biased, the potential energy barrier seen by the
electrons drops. Given the exponential nature of the Boltzmann distribution, this allows a
huge number of thermally excited electrons to jump up the potential step. So a forward
bias produces a current $I$ which is typically several amps for $V_b = +1$ volt, and which
in principle can be huge as (17.34) shows; whereas a reverse bias produces essentially no
current at all. Diodes thus pass current pretty much in one direction only, which makes
them very useful in electronic devices.
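
The one-way behaviour can be seen numerically with a minimal Python sketch of the ideal
diode relation $I = i_0\,(e^{eV_b/kT} - 1)$. The function name `diode_current` and the
default values ($i_0 = 1$ mA, taken from the typical figure quoted earlier, and
$kT/e = 0.025$ V for room temperature) are assumptions made for illustration only.

```python
import math

def diode_current(V_b, i_0=1e-3, kT_over_e=0.025):
    """Ideal diode current I = i_0 * (exp(e V_b / kT) - 1), in amps.

    V_b is the bias in volts (positive = forward bias). The defaults are
    assumptions for illustration: i_0 = 1 mA and kT/e = 0.025 V.
    """
    # expm1(x) computes exp(x) - 1 accurately for small x.
    return i_0 * math.expm1(V_b / kT_over_e)

for V_b in (-1.0, -0.1, 0.0, 0.05, 0.1):
    print(f"V_b = {V_b:+.2f} V  ->  I = {diode_current(V_b):+.3e} A")
```

Under reverse bias the current saturates at $-i_0$, while under forward bias it grows
exponentially; the sketch keeps to small forward biases because the ideal formula grows
very rapidly with $V_b$.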


18  Final  Comments  and  Acknowledgements 

Historically,  statistical  mechanics  resulted  from  physicists’  efforts  to  put  thermodynamics 
onto  a  more  mathematical  footing.  One  thing  I  have  tried  to  emphasise  in  these  notes  is 
that  this  mathematical  basis  should  not  be  taken  as  implying  that  statistical  mechanics 
is  a  completely  “closed”  subject,  whose  concepts  are  now  completely  well  defined,  easy  to 
calculate,  and  subject  to  the  application  of  endless  rigor. 

For  example,  I  put  some  emphasis  on  explaining  the  difficulties  involved  with  counting 
states  in  Section  3.1.  The  states  of  simple  systems  can  certainly  be  counted,  but  in  general 
it  seems  to  be  impossible  to  count  the  states  of  more  complex  systems  exactly,  even  in 
some  idealised  way.  A  good  example  of  this  difficulty  is  the  standard  expression  for  the 
entropy  of  an  ideal  gas,  not  covered  in  these  notes  but  which  is  found  in  some  textbooks, 
such as [5]. What might be surprising is that this entropy expression fails the Third Law of
Thermodynamics, because it does not vanish at zero temperature. Its derivation is based
on setting the gas’s number of states at a given energy $E$ to be what I have called
$\Omega_\text{tot}(E)$ instead of the more correct $\Omega(E)$, so is headed in the wrong
direction from the very start.
Given that the counting procedure used to derive that standard expression for the gas’s
entropy is already an approximation, we can certainly introduce further approximations
to make the entropy vanish at zero temperature. But it’s not clear whether any such
expression for the entropy of an ideal gas, that is as correct as it can be at all temperatures
given the relevant idealisations, exists in the literature. The subject hardly rates
a mention in textbooks. If we already meet with difficulty when calculating the entropy
of an ideal gas, we might expect bad things farther down the road. And yet in spite of
this,  statistical  mechanics  is  very  successful  at  predicting  and  shedding  light  on  some  of 
Nature’s  very  complex  behaviour. 

Another example of the difficulties underlying the subject can be found in my exposition
of blackbody radiation. Every discussion of this subject that I have seen rests on the
idea of electromagnetic field nodes at the oven walls. There seems no reason to assume this
(especially for non-metallic walls), but if such nodes do exist at the walls, then Planck’s
law would already have to be a continuous approximation to what is really a discrete
function dependent on oven size. Also, the necessary introduction of the concept of emissivity
shows  that  Planck’s  law  applies  to  an  idealised  oven  only,  for  emissivity  varies  with  the 
material  of  the  emitter,  and  is  even  a  function  of  wavelength  for  a  single  emitter.  The 
key  question  seems  to  be  just  how  to  define  such  an  idealised  oven.  I  have  stressed  this 
approach  in  these  notes. 

Parts  of  these  lectures  owe  much  to  those  given  by  Graeme  Putt  and  Paul  Barker  of 
Auckland  University’s  Physics  Department  during  my  own  undergraduate  physics  degree 
there.  My  discussions  of  the  course  topics  here  have  benefited  from  conversations  with 
Sanjeev  Arulampalam,  Shayne  Bennetts,  Scott  Foster,  David  Griffiths,  Alex  Kalloniatis, 
Roland  Keir,  Jim  McCarthy,  Jamie  Quinton,  Andy  Rawlinson,  Nikita  Siniakov,  Keith 
Stowe,  Alice  von  Trojan,  and  Vivienne  Wheaton.  I  wish  to  thank  DSTO  and  Flinders 
University  for  the  opportunity  to  give  this  lecture  course.  The  clarity  of  these  notes  was 
also  greatly  improved  by  the  feedback  of  my  class,  who  were  an  interested  group  of  students 
and  a  delight  to  teach.  There  seemed  to  be  no  time  to  really  teach  everything  properly  in 
the  few  months  in  which  I  gave  the  course,  but  the  students  were  always  willing  to  listen  to 
my  endless  asides  and  my  “here  he  goes  again”  digressions  into  semi-relevant  mathematics. 
At least, that’s the way it seemed.